Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/99257

Full metadata record

| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 謝舒凱 | zh_TW |
| dc.contributor.advisor | Shu-Kai Hsieh | en |
| dc.contributor.author | 戴宓 | zh_TW |
| dc.contributor.author | Deborah Watty | en |
| dc.date.accessioned | 2025-08-21T17:00:51Z | - |
| dc.date.available | 2025-08-22 | - |
| dc.date.copyright | 2025-08-21 | - |
| dc.date.issued | 2025 | - |
| dc.date.submitted | 2025-07-24 | - |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/99257 | - |
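| dc.description.abstract | In an increasingly interconnected world, understanding how events are reported across language boundaries is an important challenge. Taking the People's Daily (the official newspaper of the Communist Party of China) as an example, this thesis explores the use of large language models (LLMs) for multilingual opinion mining. Although LLM-based methods have proven highly effective for opinion mining tasks in recent years, research on their application to multilingual tasks remains relatively limited.
The goal of this thesis is to evaluate how LLM-based methods perform at opinion mining and question answering in monolingual and multilingual contexts, and to test whether a mismatch between content language and prompt language affects the results. The first experiment focuses on identifying entity-level sentiment with zero-shot prompts, comparing the results of Chinese, English, and German prompts. However, although sentiment analysis offers important insights, it provides no useful information about the content of the texts. To fill this gap, the second, exploratory experiment uses Retrieval-Augmented Generation (RAG) for question answering and compares the performance of three architectures across question types. In both experiments, multilingual LLMs (such as GPT-4 and Gemini) performed robustly, with only very small differences even when the data and query languages did not match. Zero-shot prompting showed strong potential for sentiment analysis: visualized sentiment toward the term "Japan" reflected the expected shifts in Sino-Japanese relations during several key events. In question answering, the choice of RAG architecture had a marked effect on performance, with different architectures excelling at different question types, which highlights the need to adapt the method to the task. These findings demonstrate the flexibility of LLM-based methods in multilingual tasks: even when data and query languages differ, they still provide effective solutions for sentiment analysis and question answering. | zh_TW |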
| dc.description.abstract | Understanding how events are reported across linguistic boundaries is a significant challenge in an increasingly interconnected world. This thesis explores the use of large language models (LLMs) for multilingual opinion mining, using The People's Daily, the official newspaper of the Communist Party of China, as a sample use case. While LLM-based methods have proven to be highly effective for opinion mining tasks in recent years, there is still relatively little research on their application to multilingual tasks.
The overall goal of this thesis was to assess the performance of LLM-based methods for opinion mining and question answering in both monolingual and multilingual contexts, evaluating whether mismatches between content and prompt languages impact outcomes. The first experiment focused on identifying entity-level sentiment using zero-shot prompting, comparing the performance of Chinese, English and German prompts. Although sentiment analysis provides valuable insights, it offers no information about the content of the texts. To address this gap, the second experiment explored question answering using Retrieval-Augmented Generation (RAG), comparing the performance of three different architectures across different question types. Across both experiments, multilingual LLMs, such as GPT-4 and Gemini, showed robust performance, with minimal differences observed when data and query languages did not match. Zero-shot prompting demonstrated strong potential for sentiment analysis, with visualizations of sentiment toward Japan revealing expected shifts during key events in Sino-Japanese relations. For question answering, the choice of RAG architecture significantly influenced performance, with different architectures excelling at different types of questions, underscoring the need to tailor the approach to the task. These findings underscore the versatility of LLM-based methods for multilingual tasks, offering effective solutions for sentiment analysis and question answering, even in cases where data and queries are in different languages. | en |
| dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-08-21T17:00:51Z No. of bitstreams: 0 | en |
| dc.description.provenance | Made available in DSpace on 2025-08-21T17:00:51Z (GMT). No. of bitstreams: 0 | en |
| dc.description.tableofcontents | ## Acknowledgements ## 摘要 (Chinese Abstract) ## Abstract ## Contents ## List of Figures ## List of Tables ## Chapter 1 Introduction - 1.1 Motivation - 1.2 Dataset - 1.3 Goals and Research Questions - 1.4 Structure of the Thesis ## Chapter 2 Literature Review - 2.1 Previous Studies on the People’s Daily - 2.2 Sentiment Analysis with LLMs - 2.2.1 Multilingual Studies - 2.3 Retrieval-Augmented Generation - 2.3.1 Architectures Using Knowledge Graphs - 2.3.2 Multilingual Applications - 2.3.3 Evaluation Methods and Criteria ## Chapter 3 Sentiment Analysis Experiment - 3.1 Test Set Creation - 3.2 Human Annotation - 3.2.1 Questionnaire - 3.2.2 Participants - 3.2.3 Inter-Annotator Agreement - 3.3 Manual Curation of a Gold Standard - 3.4 Annotation With LLMs - 3.4.1 Implementation - 3.4.2 Results - 3.4.3 Application to a Large Dataset - 3.5 Summary of Findings - 3.6 Sentiment Analysis Tool ## Chapter 4 Retrieval-Augmented Generation Experiment - 4.1 Dataset - 4.1.1 Translation into English - 4.2 Architectures - 4.2.1 VectorRAG - 4.2.2 CypherRAG - 4.2.3 GraphRAG - 4.3 Evaluation - 4.3.1 Question Generation - 4.3.2 Scoring Criteria - 4.4 Performance of Different Architectures Across Question Types - 4.5 Analysis of Patterns in Answers and Mistakes - 4.5.1 VectorRAG - 4.5.2 CypherRAG - 4.5.3 GraphRAG - 4.6 Additional Analysis: Language Bias in LLM-Based Evaluation - 4.7 Summary of Findings - 4.8 Final Tools ## Chapter 5 Discussion - 5.1 The Case for Manual Evaluation - 5.2 Limitations - 5.2.1 Sentiment Analysis - 5.2.2 RAG Evaluation - 5.3 Potential Improvements to RAG Architectures - 5.4 Potential Future Research Directions - 5.5 Conclusion ## References ## Appendix A: Supplementary Materials for Sentiment Analysis Experiment - A.1 Participant Instructions - A.2 Sentiment Analysis Prompts - A.3 Entity Name Translation Prompt - A.4 Additional Plot ## Appendix B: Supplementary Materials for RAG Experiment - B.1 Translation of Articles into English - B.1.1 Translation Prompts - B.1.2 Translation of Remaining Chinese Characters - B.2 Graph Schema for CypherRAG - B.3 Additional RAG Prompts - B.3.1 VectorRAG - B.3.2 CypherRAG - B.4 Test Question Generation Prompts - B.4.1 Detail Questions - B.4.2 Big Picture Questions - B.5 Evaluation Prompts | - |
| dc.language.iso | en | - |
| dc.subject | Large Language Models | zh_TW |
| dc.subject | Multilingual Sentiment Analysis | zh_TW |
| dc.subject | Retrieval-Augmented Generation | zh_TW |
| dc.subject | Knowledge Graphs | zh_TW |
| dc.subject | Knowledge Graphs | en |
| dc.subject | LLMs | en |
| dc.subject | Multilingual Sentiment Analysis | en |
| dc.subject | RAG | en |
| dc.title | 基於大型語言模型的外語報刊數據意見挖掘 | zh_TW |
| dc.title | Exploring LLM-Based Opinion Mining in Foreign-Language Newspaper Data | en |
| dc.type | Thesis | - |
| dc.date.schoolyear | 113-2 | - |
| dc.description.degree | Master's | - |
| dc.contributor.oralexamcommittee | 鄧志松;張瑜芸 | zh_TW |
| dc.contributor.oralexamcommittee | Chih-Sung Teng;Yu-Yun Chang | en |
| dc.subject.keyword | Large Language Models, Multilingual Sentiment Analysis, Retrieval-Augmented Generation, Knowledge Graphs | zh_TW |
| dc.subject.keyword | LLMs, Multilingual Sentiment Analysis, RAG, Knowledge Graphs | en |
| dc.relation.page | 108 | - |
| dc.identifier.doi | 10.6342/NTU202502354 | - |
| dc.rights.note | Authorization granted (open access worldwide) | - |
| dc.date.accepted | 2025-07-28 | - |
| dc.contributor.author-college | College of Liberal Arts | - |
| dc.contributor.author-dept | Graduate Institute of Linguistics | - |
| dc.date.embargo-lift | 2025-08-22 | - |
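| Appears in collections: | Graduate Institute of Linguistics | |

Files in this item:

| File | Size | Format | |
|---|---|---|---|
| ntu-113-2.pdf | 9.62 MB | Adobe PDF | View/Open |

Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.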
