Please use this identifier to cite or link to this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/92803
| Title: | 透過電影劇本中的隱喻探索大型語言模型敘事推理的極限 Unveiling Narrative Reasoning Limits of Large Language Models with Trope in Movie Synopses |
| Authors: | 許雅晴 Ya-Ching Hsu |
| Advisor: | 徐宏民 Winston H. Hsu |
| Keyword: | trope, language models, NLP, model analysis and interpretability, probing, reasoning |
| Publication Year: | 2024 |
| Degree: | Master's |
| Abstract: | Large language models (LLMs) equipped with chain-of-thought (CoT) prompting have shown significant multi-step reasoning capabilities on factual content such as mathematics, commonsense, and logic. However, their performance in narrative reasoning, which demands greater abstraction, remains unexplored. This study uses tropes in movie synopses to assess the narrative reasoning abilities of state-of-the-art LLMs and finds that they perform poorly. We introduce a trope-wise querying approach to address these challenges, which raises the F1 score by 11.8 points. Moreover, while prior studies suggest that CoT enhances multi-step reasoning, this study shows that CoT can cause hallucinations in narrative content, reducing GPT-4's performance. We also introduce an Adversarial Injection method that embeds trope-related text tokens into movie synopses that contain no explicit tropes, revealing CoT's heightened sensitivity to such injections. Our comprehensive analysis provides insights for future research directions. |
| URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/92803 |
| DOI: | 10.6342/NTU202401188 |
| Fulltext Rights: | Not authorized |
| Appears in Collections: | Department of Computer Science and Information Engineering |
Files in This Item:
| File | Size | Format | |
|---|---|---|---|
| ntu-112-2.pdf (Restricted Access) | 5.85 MB | Adobe PDF | |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
