Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/94212

Full metadata record
DC Field: Value (Language)
dc.contributor.advisor: 鄭卜壬 (zh_TW)
dc.contributor.advisor: Pu-Jen Cheng (en)
dc.contributor.author: 董光立 (zh_TW)
dc.contributor.author: Guang-Li Dong (en)
dc.date.accessioned: 2024-08-15T16:15:01Z
dc.date.available: 2024-10-05
dc.date.copyright: 2024-08-15
dc.date.issued: 2024
dc.date.submitted: 2024-08-08
dc.identifier.citation:
Office of the President (Taiwan), "Inaugural address of ROC 14th-term President Tsai Ing-wen," 2016. [Online]. Available: https://www.president.gov.tw/news/20444.
The Central News Agency, "World Press Freedom Index ranking: Taiwan rises to 27th and China ranks 9th from bottom," 2024.
Reporters Without Borders, "2024 World Press Freedom Index ranking," 2024. [Online]. Available: https://rsf.org/en/country/taiwan.
Ralf Krestel, Sabine Bergler and René Witte, "Minding the Source: Automatic Tagging of Reported Speech in Newspaper Articles," in LREC, 2008.
David K. Elson and Kathleen R. McKeown, "Automatic Attribution of Quoted Speech in Literary Narrative," in AAAI, 2010.
Chris Newell, Tim Cowlishaw and David Man, "Quote Extraction and Analysis for News," in KDD, 2018.
Timoté Vaucher, Andreas Spitz, Michele Catasta and Robert West, "Quotebank: A Corpus of Quotations from a Decade of News," in WSDM, 2021.
Kuan-Lin Lee, Yu-Chung Cheng, Pai-Lin Chen and Hen-Hsen Huang, "Keeping Their Words: Direct and Indirect Chinese Quote Attribution from Newspapers," in WWW, 2020.
Xiaoxiao Shang, Zhiyuan Peng, Qiming Yuan, Sabiq Khan, Lauren Xie, Yi Fang and Subramaniam Vincent, "DIANES: A DEI Audit Toolkit for News Sources," in SIGIR, 2022.
Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan et al., "Language Models are Few-Shot Learners," in NeurIPS, 2020.
Jacob Devlin, Ming-Wei Chang, Kenton Lee and Kristina Toutanova, "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding," in NAACL, 2019.
Sheng Shen, Zhen Dong, Jiayu Ye, Linjian Ma et al., "Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT," in AAAI, 2020.
Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang and Weizhu Chen, "LoRA: Low-Rank Adaptation of Large Language Models," in ICLR, 2022.
Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, Fei Xia, Ed H. Chi, Quoc V. Le and Denny Zhou, "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models," in NeurIPS, 2022.
Guidance AI, "Guidance: A language for controlling large language models." [Online]. Available: https://github.com/guidance-ai/guidance.
Marti A. Hearst, "Automatic Acquisition of Hyponyms from Large Text Corpora," in COLING, 1992.
John Lafferty, Andrew McCallum and Fernando Pereira, "Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data," in ICML, 2001.
Sunita Sarawagi, "Information Extraction," Foundations and Trends in Databases, 2008.
Guillaume Lample, Miguel Ballesteros, Sandeep Subramanian, Kazuya Kawakami and Chris Dyer, "Neural Architectures for Named Entity Recognition," in NAACL, 2016.
Monica Agrawal, Stefan Hegselmann and Hunter Lang, "Large language models are few-shot clinical information extractors," in EMNLP, 2022.
Minqing Hu and Bing Liu, "Mining and Summarizing Customer Reviews," in SIGKDD, 2004.
Bo Pang, Lillian Lee and Shivakumar Vaithyanathan, "Thumbs up? Sentiment Classification Using Machine Learning Techniques," in EMNLP, 2002.
Peter D. Turney, "Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews," in ACL, 2002.
Yoon Kim, "Convolutional Neural Networks for Sentence Classification," in EMNLP, 2014.
Sewon Min, Xinxi Lyu, Ari Holtzman, Mikel Artetxe, Mike Lewis, Hannaneh Hajishirzi and Luke Zettlemoyer, "Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?," in EMNLP, 2022.
Wenxuan Zhang, Yue Deng, Bing Liu, Sinno Jialin Pan and Lidong Bing, "Sentiment Analysis in the Era of Large Language Models: A Reality Check," in NAACL, 2024.
Central News Agency (CNA), Focus Taiwan "About Us" page. [Online]. Available: https://focustaiwan.tw/aboutus.
OpenAI, "GPT-3.5 Turbo fine-tuning and API updates," 22 Aug. 2023. [Online]. Available: https://openai.com/index/gpt-3-5-turbo-fine-tuning-and-api-updates/.
NARLabs, "Llama3-TAIDE-LX-8B-Chat-Alpha1." [Online]. Available: https://huggingface.co/taide/Llama3-TAIDE-LX-8B-Chat-Alpha1.
National Applied Research Laboratories, "TAIDE." [Online]. Available: https://en.taide.tw/.
OpenAI, "DALL·E 3." [Online]. Available: https://openai.com/index/dall-e-3/.
United Nations, "UNRWA Situation Report #1 on the Situation in the Gaza Strip," 7 Oct. 2023. [Online]. Available: https://www.un.org/unispal/document/unrwa-situation-report-1-on-the-situation-in-the-gaza-strip/.
BBC, "US and China agree to resume military communications after summit," 2024. [Online]. Available: https://www.bbc.com/news/world-us-canada-67411191.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/94212
dc.description.abstract: News opinion mining aims to collect and process the statements released by interviewees and organizations in the news, objectively summarizing and analyzing the similarities and differences among various stances. To achieve more diverse, equitable, transparent, and comprehensive opinion analysis, we have designed and implemented a news opinion extraction and analysis system based on controllable large language models (LLMs), which offers the following three main benefits:
Transparent Process: By converting pieces of unstructured information into structured tables, the system not only displays the final summarized results but also exposes the entire computational process, making it easier for experts to review, analyze, intervene in, and adjust the workflow.
Simplified Design: Using LLMs to connect multiple natural language processing tasks simplifies the algorithmic design of the analysis system and improves overall computational efficiency.
Efficient Extraction: Through in-context learning, model fine-tuning, and efficient programming templates for guiding language models, even small models can achieve performance comparable to state-of-the-art (SOTA) LLMs on quote extraction tasks.
This system not only improves the accuracy and efficiency of news opinion extraction and analysis but also provides experts with more intuitive and operable analytical tools, bringing more comprehensive and objective opinion analysis to the news domain. (en)
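The "Transparent Process" benefit described in the abstract — converting unstructured LLM output into an inspectable structured table before summarization — can be sketched as follows. This is a minimal illustration under assumed names: the JSON schema (speaker/title/quote/sentiment) and the sample records are hypothetical, not the thesis's actual format, and no real model call is made.

```python
import json
from collections import defaultdict

# Stand-in for a controllable LLM's response: one JSON record per extracted
# quote. The schema here is an illustrative assumption, not the thesis's own.
llm_output = json.dumps([
    {"speaker": "Spokesperson A", "title": "Ministry official",
     "quote": "We welcome the resumption of talks.", "sentiment": "positive"},
    {"speaker": "Analyst B", "title": "Think-tank researcher",
     "quote": "The agreement changes little on the ground.", "sentiment": "negative"},
])

def to_table(raw_json: str) -> list[dict]:
    """Parse the model's JSON into structured rows, skipping malformed records."""
    rows = []
    for rec in json.loads(raw_json):
        if {"speaker", "quote", "sentiment"} <= rec.keys():
            rows.append(rec)
    return rows

def group_by_sentiment(rows: list[dict]) -> dict[str, list[str]]:
    """Aggregate speakers by stance; the intermediate table stays inspectable."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["sentiment"]].append(row["speaker"])
    return dict(groups)

table = to_table(llm_output)
print(group_by_sentiment(table))
# prints {'positive': ['Spokesperson A'], 'negative': ['Analyst B']}
```

Because the structured table is an explicit intermediate artifact rather than hidden model state, an expert can review or correct individual rows before the stance-level summary is produced.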
dc.description.provenance: Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2024-08-15T16:15:01Z. No. of bitstreams: 0 (en)
dc.description.provenance: Made available in DSpace on 2024-08-15T16:15:01Z (GMT). No. of bitstreams: 0 (en)
dc.description.tableofcontents:
Acknowledgements i
Chinese Abstract ii
Abstract iii
Contents iv
List of Figures vi
List of Tables vii
Chapter 1 Introduction 1
1.1 Motivation 1
1.2 Workflow 2
1.3 Thesis Structure 5
Chapter 2 Background and Related Work 7
2.1 Quote Extraction 7
2.2 Large Language Models 9
2.3 Information Extraction 11
2.4 Sentiment Analysis 12
Chapter 3 Quote Extraction and Experiment Results 15
3.1 Quotation Types 15
3.2 Data Source and Crawler 16
3.3 JSON Format 18
3.4 Supervised Fine-Tuning 19
3.5 Programming Paradigm 22
3.6 Experiment Results 26
Chapter 4 News Opinion Analysis System 29
4.1 Workflow 29
4.2 Title and Name Extraction 30
4.3 Three Classification Methods 31
4.4 Five Categories 34
4.5 Name and Title Classification 36
4.6 Aspect Extraction 37
4.7 Sentiment Classification 39
4.8 Aspect Summarization 40
Chapter 5 User Interface and Demo 41
5.1 Index 41
5.2 Israel–Hamas War 42
5.3 Biden-Xi Meeting 46
Chapter 6 Conclusion and Future Work 50
6.1 Conclusion 50
6.2 Future Work 51
References 53
dc.language.iso: en
dc.title: 設計與實作基於可控制大型語言模型的新聞意見萃取與分析系統 (zh_TW)
dc.title: Design and Implementation of a News Opinion Extraction and Analysis System Based on Controllable Large Language Models (en)
dc.type: Thesis
dc.date.schoolyear: 112-2
dc.description.degree: 碩士 (Master's)
dc.contributor.oralexamcommittee: 邱志義; 魏志達 (zh_TW)
dc.contributor.oralexamcommittee: Chih-Yi Chiu; Jyh-Da Wei (en)
dc.subject.keyword: 引述抽取, 意見探勘, 資訊抽取, 大型語言模型, 新聞分析 (zh_TW)
dc.subject.keyword: Quote Extraction, Sentiment Analysis, Information Extraction, Large Language Model, News Analysis (en)
dc.relation.page: 55
dc.identifier.doi: 10.6342/NTU202403256
dc.rights.note: Authorized (open access worldwide)
dc.date.accepted: 2024-08-10
dc.contributor.author-college: 電機資訊學院 (College of Electrical Engineering and Computer Science)
dc.contributor.author-dept: 資訊工程學系 (Department of Computer Science and Information Engineering)
dc.date.embargo-lift: 2027-08-10
Appears in Collections: 資訊工程學系 (Department of Computer Science and Information Engineering)

Files in This Item:
File: ntu-112-2.pdf (publicly available online after 2027-08-10) | Size: 6.71 MB | Format: Adobe PDF

