Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/96076

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 廖世偉 | zh_TW |
| dc.contributor.advisor | Shih-Wei Liao | en |
| dc.contributor.author | 郭政維 | zh_TW |
| dc.contributor.author | Cheng-Wei Kuo | en |
| dc.date.accessioned | 2024-10-14T16:04:20Z | - |
| dc.date.available | 2024-10-15 | - |
| dc.date.copyright | 2024-10-14 | - |
| dc.date.issued | 2024 | - |
| dc.date.submitted | 2024-10-01 | - |
| dc.identifier.citation | R. Albalawi, T. H. Yeap, and M. Benyoucef. Using topic modeling methods for short-text data: A comparative analysis. Frontiers in Artificial Intelligence, 3:42, 2020. D. Angelov. Top2vec: Distributed representations of topics. arXiv preprint arXiv:2008.09470, 2020. F. Bianchi, S. Terragni, and D. Hovy. Pre-training is a hot topic: Contextualized document embeddings improve topic coherence. arXiv preprint arXiv:2004.03974, 2020. D. Blei and J. Lafferty. Correlated topic models. Advances in Neural Information Processing Systems, 18:147, 2006. D. M. Blei and J. D. Lafferty. A correlated topic model of science. 2007. D. M. Blei, A. Y. Ng, and M. I. Jordan. Latent Dirichlet allocation. Journal of Machine Learning Research, 3(Jan):993–1022, 2003. J. Carbonell and J. Stewart. The use of MMR, diversity-based reranking for reordering documents and producing summaries. SIGIR Forum (ACM Special Interest Group on Information Retrieval), 06 1999. J. Chuang, C. D. Manning, and J. Heer. Termite: Visualization techniques for assessing textual topic models. In Proceedings of the International Working Conference on Advanced Visual Interfaces, AVI ’12, pages 74–77, New York, NY, USA, 2012. Association for Computing Machinery. W. Cui, Y. Wu, S. Liu, F. Wei, M. X. Zhou, and H. Qu. Context preserving dynamic word cloud visualization. In 2010 IEEE Pacific Visualization Symposium (PacificVis), pages 121–128. IEEE, 2010. J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, 2018. A. B. Dieng, F. J. Ruiz, and D. M. Blei. Topic modeling in embedding spaces. Transactions of the Association for Computational Linguistics, 8:439–453, 2020. R. Egger and J. Yu. A topic modeling comparison between LDA, NMF, Top2Vec, and BERTopic to demystify Twitter posts. Frontiers in Sociology, 7:886498, 2022. L. Floridi and M. Chiriatti. GPT-3: Its nature, scope, limits, and consequences. Minds and Machines, 30:681–694, 2020. T. Griffiths, M. Jordan, J. Tenenbaum, and D. Blei. Hierarchical topic models and the nested Chinese restaurant process. Advances in Neural Information Processing Systems, 16, 2003. M. Grootendorst. KeyBERT: Minimal keyword extraction with BERT. 2020. M. Grootendorst. BERTopic: Neural topic modeling with a class-based TF-IDF procedure. arXiv preprint arXiv:2203.05794, 2022. A. Krishnan. Exploring the power of topic modeling techniques in analyzing customer reviews: A comparative analysis. arXiv preprint arXiv:2308.11520, 2023. J. H. Lau, D. Newman, and T. Baldwin. Machine reading tea leaves: Automatically evaluating topic coherence and topic model quality. In S. Wintner, S. Goldwater, and S. Riezler, editors, Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, pages 530–539, Gothenburg, Sweden, Apr. 2014. Association for Computational Linguistics. D. Lee and H. S. Seung. Algorithms for non-negative matrix factorization. Advances in Neural Information Processing Systems, 13, 2000. D. D. Lee and H. S. Seung. Learning the parts of objects by non-negative matrix factorization. Nature, 401(6755):788–791, 1999. X. Liu, Y. Zheng, Z. Du, M. Ding, Y. Qian, Z. Yang, and J. Tang. GPT understands, too. AI Open, 2023. L. McInnes, J. Healy, and J. Melville. UMAP: Uniform manifold approximation and projection for dimension reduction. arXiv preprint arXiv:1802.03426, 2018. Y. Meng, Y. Zhang, J. Huang, Y. Zhang, and J. Han. Topic discovery via latent space clustering of pretrained language model representations. In Proceedings of the ACM Web Conference 2022, pages 3143–3152, 2022. B. Ogunleye, T. Maswera, L. Hirsch, J. Gaudoin, and T. Brunsdon. Comparison of topic modelling approaches in the banking context. Applied Sciences, 13(2):797, 2023. G. Papadia, M. Pacella, M. Perrone, and V. Giliberti. A comparison of different topic modeling methods through a real case study of Italian customer care. Algorithms, 16(2):94, 2023. N. Reimers and I. Gurevych. Sentence-BERT: Sentence embeddings using Siamese BERT-networks. arXiv preprint arXiv:1908.10084, 2019. M. Röder, A. Both, and A. Hinneburg. Exploring the space of topic coherence measures. In Proceedings of the Eighth ACM International Conference on Web Search and Data Mining, WSDM ’15, pages 399–408, New York, NY, USA, 2015. Association for Computing Machinery. S. Terragni and E. Fersini. OCTIS 2.0: Optimizing and comparing topic models in Italian is even simpler! In E. Fersini, M. Passarotti, and V. Patti, editors, Proceedings of the Eighth Italian Conference on Computational Linguistics, CLiC-it 2021, Milan, Italy, January 26-28, 2022, volume 3033 of CEUR Workshop Proceedings. CEUR-WS.org, 2021. S. Terragni, E. Fersini, B. G. Galuzzi, P. Tropeano, and A. Candelieri. OCTIS: Comparing and optimizing topic models is simple! In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations, pages 263–270. Association for Computational Linguistics, Apr. 2021. | - |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/96076 | - |
| dc.description.abstract | 在大數據時代,網路的普及導致了用戶生成內容的激增,包括應用程式評論。這種數據的激增既帶來了機會也帶來了挑戰,用於提取有意義的見解。本研究旨在利用主題模型技術,將大量的應用程式評論數據轉化為可操作的見解。通過應用主題模型方法,我們試圖識別評論中的潛在主題和趨勢。我們使用以 BERTopic 為基礎的方法,並比較其他主題模型技術來分析評論數據。我們的數據集來自熱門應用 TikTok 的評論,這些評論提供了多樣且全面的用戶意見和反饋。為了評估主題模型的性能,我們採用了多種評估指標,包括連貫性、多樣性和可解釋性,以確保模型生成有意義且有用的主題。我們將各種不同的結果以多元的可視化技術來展現,有助於有效地傳達從數據中獲得的見解,使得結果更容易理解和解釋。 | zh_TW |
| dc.description.abstract | In the era of big data, the proliferation of the internet has produced an overwhelming amount of user-generated content, including app reviews. This surge in data presents both opportunities and challenges for extracting meaningful insights. This study aims to harness topic modeling techniques to transform large volumes of app review data into actionable insights. By applying topic modeling methods, we seek to identify underlying themes and trends within the reviews. We used a BERTopic-based approach and compared it with other topic modeling techniques to analyze the review data. Our dataset consists of reviews of the popular app TikTok, which provide a diverse and comprehensive collection of user opinions and feedback. To assess the performance of the topic models, we employed several evaluation metrics, including coherence, diversity, and interpretability, to ensure the models generated meaningful and useful topics. We presented the results with a range of visualization techniques, which help communicate the insights derived from the data and make the findings easier to understand and interpret. | en |
| dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2024-10-14T16:04:19Z No. of bitstreams: 0 | en |
| dc.description.provenance | Made available in DSpace on 2024-10-14T16:04:20Z (GMT). No. of bitstreams: 0 | en |
| dc.description.tableofcontents | Acknowledgements; 摘要; Abstract; Contents; List of Figures; List of Tables; Chapter 1 Introduction (1.1 Introduction); Chapter 2 Related Work (2.1 Related Work); Chapter 3 Methodology (3.1 Data: 3.1.1 Dataset, 3.1.2 Data Preprocessing; 3.2 Topic Modeling; 3.3 Evaluation Metric: 3.3.1 Coherence (Cv), 3.3.2 Topic Diversity (TD), 3.3.3 Interpretability); Chapter 4 Result (4.1 Model Evaluation; 4.2 Hyperparameter Tuning: 4.2.1 Topic Number, 4.2.2 UMAP Parameters; 4.3 Topic Representation: 4.3.1 KeyBERT, 4.3.2 MMR; 4.4 Visualization); Chapter 5 Conclusion; References | - |
| dc.language.iso | en | - |
| dc.title | 使用主題模型技術探索應用程式評論 | zh_TW |
| dc.title | Exploring App Reviews Using Topic Modeling Techniques | en |
| dc.type | Thesis | - |
| dc.date.schoolyear | 113-1 | - |
| dc.description.degree | 碩士 (Master's) | - |
| dc.contributor.oralexamcommittee | 傅楸善;盧瑞山;李逸元;林正偉 | zh_TW |
| dc.contributor.oralexamcommittee | Chiou-Shann Fuh;Ruei-Shan Lu;Yi-Yuan Lee;Jeng-Wei Lin | en |
| dc.subject.keyword | 主題模型,BERTopic,主題提取,自然語言處理 (NLP),應用程式評論 | zh_TW |
| dc.subject.keyword | topic model, BERTopic, topic extraction, natural language processing (NLP), app reviews | en |
| dc.relation.page | 35 | - |
| dc.identifier.doi | 10.6342/NTU202402739 | - |
| dc.rights.note | 未授權 (not authorized for release) | - |
| dc.date.accepted | 2024-10-01 | - |
| dc.contributor.author-college | 電機資訊學院 (College of Electrical Engineering and Computer Science) | - |
| dc.contributor.author-dept | 資訊工程學系 (Department of Computer Science and Information Engineering) | - |
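
As a concrete companion to the pipeline described in the abstract above (BERTopic-based topic extraction from TikTok app reviews, evaluated for coherence, diversity, and interpretability), the following is a minimal sketch, not the thesis's actual code. It assumes the public `bertopic`, `sentence-transformers`, `umap-learn`, and `pandas` packages; the file name `tiktok_reviews.csv`, the column `review_text`, the embedding model, and all hyperparameters are hypothetical placeholders rather than the settings used in the thesis. Only topic diversity is computed here, using the common "unique words among the top-k topic words" definition; the Cv coherence metric mentioned in the record is not shown.

```python
# Minimal illustrative sketch (assumed packages: bertopic, sentence-transformers,
# umap-learn, pandas). File name, column name, model choice, and hyperparameters
# are placeholders, not the thesis's actual settings.
import pandas as pd
from bertopic import BERTopic
from sentence_transformers import SentenceTransformer
from umap import UMAP

# Hypothetical preprocessed TikTok review texts, one review per row.
reviews = pd.read_csv("tiktok_reviews.csv")["review_text"].dropna().tolist()

# BERTopic pipeline: sentence embeddings -> UMAP reduction -> clustering -> c-TF-IDF.
embedding_model = SentenceTransformer("all-MiniLM-L6-v2")
umap_model = UMAP(n_neighbors=15, n_components=5, min_dist=0.0, metric="cosine")
topic_model = BERTopic(embedding_model=embedding_model, umap_model=umap_model, verbose=True)
topics, _ = topic_model.fit_transform(reviews)

# Topic diversity (TD): proportion of unique words among the top-k words of all topics.
top_k = 10
topic_words = [
    [word for word, _ in topic_model.get_topic(topic_id)[:top_k]]
    for topic_id in topic_model.get_topics()
    if topic_id != -1  # skip the outlier topic
]
all_words = [word for words in topic_words for word in words]
topic_diversity = len(set(all_words)) / len(all_words) if all_words else 0.0
print(f"Topics found: {len(topic_words)}, topic diversity: {topic_diversity:.2f}")
```
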
Appears in Collections: 資訊工程學系 (Department of Computer Science and Information Engineering)
Files in This Item:
| File | Size | Format |
|---|---|---|
| ntu-113-1.pdf (restricted access) | 1.53 MB | Adobe PDF |
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.
