  1. NTU Theses and Dissertations Repository
  2. College of Electrical Engineering and Computer Science
  3. Data Science Degree Program
Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/90968
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | 謝宏昀 | zh_TW
dc.contributor.advisor | Hung-Yun Hsieh | en
dc.contributor.author | 洪贊濱 | zh_TW
dc.contributor.author | Tsan-Pin Hung | en
dc.date.accessioned | 2023-10-24T16:32:20Z | -
dc.date.available | 2024-08-14 | -
dc.date.copyright | 2023-10-24 | -
dc.date.issued | 2023 | -
dc.date.submitted | 2023-08-11 | -
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/90968 | -
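dc.description.abstract | In spam review detection, graph-based methods have attracted wide attention because they capture the interactions among reviews. However, a graph neural network (GNN) repeatedly aggregates neighbor information, which leads to over-smoothing: the node representations of benign and spam reviews may converge. Earlier work tried to reduce this effect by considering both homogeneous and heterogeneous connections and by aggregating heterogeneous connections in reverse. It still applies the same aggregation function to neighbors with different labels, and it assumes that all benign review representations, and likewise all spam review representations, should be close to one another, so over-smoothing is not effectively avoided. In addition, updating all node representations at once demands too much memory as the data grows, so subgraph aggregation becomes indispensable in practice. Yet previous methods do not take the topological structure of the graph into account when sampling neighbors to build subgraphs, and therefore cannot effectively capture information from closely interacting neighbors. To address these problems, we propose a GNN spam review model based on potential relations. The model samples according to the topological-structure similarity of the graph to produce subgraphs for stochastic training. Before aggregating neighbor information, it uses a classifier to identify potentially benign and potentially spam neighboring reviews. It then applies a hierarchical aggregation strategy: potentially benign and potentially spam reviews are treated as two different relations and aggregated separately, and the information from the two kinds of neighbors is combined for the next layer of aggregation. We also design a new triplet loss function that makes the similarity between the representation of a benign review and the representation of the reviewed target higher than the similarity with spam review nodes, which reduces the effect of over-smoothing and better matches real-world observations. Our experimental results demonstrate the effectiveness of the method. On the yelpNYC dataset with random splitting, our AUC is on average 6% higher than the primary reference model and 1.5% higher than the secondary reference model, reaching 0.84; with chronological splitting, our AUC is on average 5.5% and 6.5% higher than the primary and secondary reference models, respectively. The method also outperforms the reference models on other datasets, and its AUC exceeds that of the other two models in every experimental run. | zh_TW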
dc.description.abstract | Graph-based spam review detection is appealing because it captures interactions among reviews. However, graph neural networks (GNNs) suffer from over-smoothing: repeatedly aggregating neighborhood information makes the representations of benign and spam reviews hard to distinguish. Existing studies consider both homogeneous and heterogeneous connections, but they apply the same aggregation function to neighbors with different labels and presume that all benign review representations, and likewise all spam review representations, should be similar to one another, so over-smoothing is not effectively avoided. Additionally, updating all node representations at once becomes infeasible as data volume grows because of memory constraints, which necessitates subgraph aggregation. Prior approaches, however, do not consider the topological structure of the graph when constructing subgraphs, making it difficult to capture information from closely interacting neighbors. To address these issues, we present a GNN model for spam review detection based on potential labels. Our model samples subgraphs according to the topology of the graph, uses a hierarchical aggregation strategy, and treats the potential labels of benign and spam reviews as two different relations. We also design a novel triplet loss function that makes the similarity between the representation of a benign review and that of the review target higher than the similarity with spam review nodes, mitigating over-smoothing. Our experimental results demonstrate the effectiveness of the method. On the YelpNYC dataset under random splitting, our approach outperforms the primary and secondary baseline models by 6% and 1.5%, respectively, in average AUC, achieving a score of 0.84; under chronological splitting, our average AUC is 5.5% and 6.5% higher than the primary and secondary baseline models, respectively, achieving a score of 0.68. Our method also achieves superior results on other datasets and exceeds the AUC of both baselines in every experimental run. | en
dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2023-10-24T16:32:20Z. No. of bitstreams: 0 | en
dc.description.provenance | Made available in DSpace on 2023-10-24T16:32:20Z (GMT). No. of bitstreams: 0 | en
dc.description.tableofcontents | ABSTRACT
LIST OF TABLES
LIST OF FIGURES
CHAPTER 1 INTRODUCTION
CHAPTER 2 BACKGROUND AND RELATED WORK
2.1 Spam Review
2.2 Graph-based Spam Review Detection
2.2.1 Graph Neural Network
2.2.2 Stochastic Training on Graphs
2.2.3 Neighbor Sampler
2.3 Related Work
2.3.1 GAS
2.3.2 H2-FDetector
2.4 Summary
CHAPTER 3 SYSTEM MODEL
3.1 Dataset Description
3.2 Comment Graph Construction
3.2.1 Review Context Representation
3.2.2 Edge Representation
3.2.3 Node Representation
3.3 Graph Sampling
3.4 Heterogeneous Graph Convolutional Network
3.4.1 Aggregation Stage
3.4.2 Combination Stage
3.4.3 Summary of the HGNN Model
3.5 Summary
CHAPTER 4 METHODOLOGY
4.1 Motivation
4.2 Model Architecture
4.3 Topology Aware Graph Sampling
4.4 PL-RGNN Model
4.4.1 Potential Label Identification
4.4.2 Relational Graph Attention Aggregation
4.5 Optimization
4.5.1 Triplet Loss
4.5.2 Focal Loss
CHAPTER 5 PERFORMANCE EVALUATION
5.1 Experiment Setup
5.2 Evaluate Method
5.2.1 Metrics
5.2.2 Visualization
5.3 Average Performance Analysis
5.4 Model Trade-off Analysis
5.5 Embedding Visualization
5.6 Performance Comparison of Different Embedding Methods with Baselines
5.7 Ablation Study
5.8 Performance Comparison on Amazon dataset
5.9 Summary
CHAPTER 6 CONCLUSION AND FUTURE WORK
REFERENCES | -
dc.language.iso | en | -
dc.title | 以帶潛在標籤的關係圖神經網絡改進垃圾評論之檢測 | zh_TW
dc.title | Improving Detection of Spam Reviews via Relational Graph Neural Networks with Potential Labels | en
dc.type | Thesis | -
dc.date.schoolyear | 111-2 | -
dc.description.degree | Master's | -
dc.contributor.coadvisor | 王志宇 | zh_TW
dc.contributor.coadvisor | Chih-Yu Wang | en
dc.contributor.oralexamcommittee | 黃瀚萱;蔡銘峰 | zh_TW
dc.contributor.oralexamcommittee | Hen-Hsen Huang;Ming-Feng Tsai | en
dc.subject.keyword | Relational graph neural networks, spam review detection | zh_TW
dc.subject.keyword | Relational Neural Networks, spam review detection | en
dc.relation.page | 60 | -
dc.identifier.doi | 10.6342/NTU202304108 | -
dc.rights.note | Authorization granted (open access worldwide) | -
dc.date.accepted | 2023-08-13 | -
dc.contributor.author-college | College of Electrical Engineering and Computer Science | -
dc.contributor.author-dept | Data Science Degree Program | -
dc.date.embargo-lift | 2024-08-14 | -
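
The abstract describes a triplet loss that keeps a benign review closer to the reviewed target than a spam review is. The record does not give the formula, so the following is only a minimal sketch under stated assumptions: cosine similarity as the similarity measure, a hinge form, and a margin of 0.5 are all illustrative choices, and the function names are hypothetical rather than taken from the thesis.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def triplet_loss(target, benign, spam, margin=0.5):
    """Hinge-style triplet loss on similarities (assumed form).

    The loss is zero once sim(target, benign) exceeds
    sim(target, spam) by at least `margin` (an assumed value).
    """
    return max(0.0, margin - cosine(target, benign) + cosine(target, spam))


# Toy embeddings: the benign review is aligned with the target,
# the spam review is orthogonal to it.
target = [1.0, 0.0]
benign = [1.0, 0.0]
spam = [0.0, 1.0]

print(triplet_loss(target, benign, spam))  # desired ordering, no penalty
print(triplet_loss(target, spam, benign))  # roles swapped, penalized
```

With these toy vectors the first call incurs no loss because the desired ordering already holds with room to spare, while swapping the roles of the benign and spam reviews yields a positive loss.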
Appears in Collections: Data Science Degree Program

Files in This Item:
File | Size | Format
ntu-111-2.pdf | 2.99 MB | Adobe PDF
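Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.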
