Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/5254
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 盧信銘(Hsin-Min Lu) | |
dc.contributor.author | Kai-Ti Chang | en |
dc.contributor.author | 張凱迪 | zh_TW |
dc.date.accessioned | 2021-05-15T17:54:27Z | - |
dc.date.available | 2015-02-03 | |
dc.date.available | 2021-05-15T17:54:27Z | - |
dc.date.copyright | 2015-02-03 | |
dc.date.issued | 2014 | |
dc.date.submitted | 2014-11-19 | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/5254 | - |
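dc.description.abstract | With the rapid development of Internet technology, the web is full of all kinds of reviews, and analyzing this unstructured data is becoming increasingly important. In reviews of services or products, however, users often leave only an overall rating, without rating the individual topical aspects of the service or product or revealing the weight they place on each aspect, which limits how useful the reviews are to other users. The problem of inferring a document's topical aspect ratings and their weights is called Latent Aspect Rating Analysis (LARA).
This study applies LARA to Chinese reviews using Local Latent Dirichlet Allocation (Local LDA) and the Latent Rating Regression (LRR) model. The experiment uses a two-stage model. In the first stage, Local LDA performs aspect segmentation and aspect extraction on the preprocessed review text. In the second stage, the LRR model, in a form similar to the EM algorithm, infers each document's topical aspect ratings and their weights. The datasets are travel reviews from Ctrip (攜程網), the largest Chinese-language travel website, and from TripAdvisor, the world's largest travel review website; the Ctrip data was collected with a web crawler and then cleaned. The experiments show that the Local LDA method performs relatively better than Bootstrap. Because Local LDA is unsupervised, it requires no manually specified seed keywords, which makes the approach more widely applicable. | zh_TW |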
dc.description.abstract | With the growth of web technology, mining the detailed information in online reviews has become an important task. Most reviewers rate an entity only with an overall rating; however, this is not enough for users to learn more from the reviews. As a result, a new text mining problem called Latent Aspect Rating Analysis (LARA) has emerged, which infers latent aspect ratings and latent aspect weights simultaneously.
In this research, we apply LARA to Chinese reviews. We use Local LDA (an unsupervised learning method) and the LRR model to analyze online reviews. In the first stage, we apply Local LDA to the preprocessed review text to perform aspect segmentation, which yields the aspects and their representative words. In the second stage, we use the LRR model to infer the latent aspect ratings and latent aspect weights. Our experiment uses Ctrip and TripAdvisor online reviews as the datasets. The results demonstrate that the Local LDA + LRR method has some advantages on Chinese LARA problems. | en |
dc.description.provenance | Made available in DSpace on 2021-05-15T17:54:27Z (GMT). No. of bitstreams: 1 ntu-103-R01725032-1.pdf: 1777356 bytes, checksum: 832751af9c8e411bfbd803bc9ba3a187 (MD5) Previous issue date: 2014 | en |
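dc.description.tableofcontents | Oral Examination Committee Certification #; Acknowledgements i; Chinese Abstract ii; Abstract iii; Table of Contents iv; List of Figures vi; List of Tables vii; Chapter 1 Introduction 1; 1.1 Research Motivation and Background 1; 1.2 Research Objectives 3; 1.3 Research Structure 3; Chapter 2 Literature Review 4; 2.1 Overview of Sentiment Analysis and Opinion Mining Research 4; 2.2 Document-Level and Sentence-Level Analysis 6; 2.2.1 Supervised Document-Level Sentiment Analysis 6; 2.2.3 Supervised Sentence-Level Sentiment Analysis 7; 2.3 Topical Aspect-Level Analysis 7; 2.3.1 Topical Aspect Extraction 7; 2.3.2 Aspect-Based Sentiment Analysis 9; 2.3.3 Aspect-Based Rating Analysis 10; 2.4 Research on the LARA Problem 10; 2.4.1 Bootstrap-Based LARA 10; 2.4.2 Local LDA-Based LARA 17; 2.5 Summary 21; Chapter 3 Problem Definition and System Design 23; 3.1 Problem Definition 23; 3.2 System Design 24; 3.3 Baseline Model 33; Chapter 4 Data Processing 34; 4.1 Data Sources 34; 4.2 Data Preprocessing 36; 4.2.1 Ctrip Data 36; 4.2.2 TripAdvisor Data 37; 4.3 Experiment Description 37; 4.3.1 Local LDA Model 38; 4.3.2 LRR Model 41; 4.4 Baseline Model Experiment 42; Chapter 5 Experimental Results 47; Chapter 6 Conclusions and Suggestions 51; 6.1 Experimental Conclusions 51; 6.2 Research Contributions 51; 6.3 Future Research Directions 52; References 53 | |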
dc.language.iso | zh-TW | |
dc.title | 應用潛藏面相評分分析於中文評論:使用局部潛藏狄利克雷分配方法 | zh_TW |
dc.title | Latent Aspect Rating Analysis on Chinese Reviews: A Local LDA Based Approach | en |
dc.type | Thesis | |
dc.date.schoolyear | 103-1 | |
dc.description.degree | Master's | |
dc.contributor.oralexamcommittee | 曹承礎(Seng-cho T. Chou), 李昇暾(Sheng-Tun Li) | |
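dc.subject.keyword | text mining, latent aspect rating analysis, latent Dirichlet allocation, latent rating regression model, sentiment analysis, opinion mining, review analysis | zh_TW |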
dc.subject.keyword | text mining, Latent Aspect Rating Analysis, Latent Dirichlet Allocation, Latent Rating Regression Model, sentiment analysis, opinion mining, review mining | en |
dc.relation.page | 56 | |
dc.rights.note | Authorization granted (worldwide open access) | |
dc.date.accepted | 2014-11-19 | |
dc.contributor.author-college | College of Management | zh_TW |
dc.contributor.author-dept | Graduate Institute of Information Management | zh_TW |
Appears in Collections: | Department of Information Management |
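As a rough illustration of the second stage described in the abstract, the sketch below shows how an LRR-style model could turn per-aspect word frequencies into aspect ratings and then into a weighted overall rating. It assumes the usual LRR form, in which each aspect rating is a weighted sum of word frequencies and the overall rating is a weighted sum of aspect ratings. The function names, the softmax normalisation of the weights, and the toy numbers are all assumptions made for illustration, not the thesis's implementation. Parameter fitting (the EM-like procedure) is omitted.

```python
import math

def aspect_ratings(word_freq, beta):
    """Rating of each aspect: sum over words of (sentiment weight * word frequency).

    word_freq[i][j] is the normalised frequency of word j in the text assigned
    to aspect i; beta[i][j] is that word's (hypothetical) sentiment weight.
    """
    return [sum(b * w for b, w in zip(beta_i, freq_i))
            for beta_i, freq_i in zip(beta, word_freq)]

def aspect_weights(alpha_raw):
    """Map unconstrained scores to positive weights that sum to 1 (softmax).

    The softmax is an assumed normalisation chosen for this sketch.
    """
    m = max(alpha_raw)
    exps = [math.exp(a - m) for a in alpha_raw]
    total = sum(exps)
    return [e / total for e in exps]

def overall_rating(word_freq, beta, alpha_raw):
    """Predicted overall rating: weighted sum of the aspect ratings."""
    ratings = aspect_ratings(word_freq, beta)
    weights = aspect_weights(alpha_raw)
    return sum(a * s for a, s in zip(weights, ratings))

# Toy review with two aspects and two vocabulary words per aspect.
toy_freq = [[0.5, 0.5], [1.0, 0.0]]
toy_beta = [[4.0, 2.0], [5.0, 1.0]]
toy_alpha = [0.0, 0.0]  # equal raw scores give equal aspect weights

print(aspect_ratings(toy_freq, toy_beta))
print(overall_rating(toy_freq, toy_beta, toy_alpha))
```

In a full system, the first stage (Local LDA) would supply the aspect assignment behind `word_freq`, and the model parameters would be estimated from observed overall ratings rather than set by hand.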
Files in This Item:
File | Size | Format | |
---|---|---|---|
ntu-103-1.pdf | 1.74 MB | Adobe PDF | View/Open |
Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are otherwise indicated.