NTU Theses and Dissertations Repository
Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/97359
Title: MENSA: Leveraging Mental Simulation for Dynamic Experience Retrieval in LLM Agents
Authors: Chung-Che Chang (張仲喆)
Advisor: Yung-Jen Hsu (許永真)
Co-Advisor: Li-Chen Fu (傅立成)
Keyword: Large Language Model, LLM-based Agent, Mental Simulation, Prompt Engineering, Few-shot In-context Learning
Publication Year: 2024
Degree: Master's
Abstract: Agents powered by large language models (LLMs) have shown potential for performing sequential decision-making tasks in interactive text environments. However, these LLM agents often fail at the key steps of a task because they are unaware of the preconditions an action requires. Humans, in contrast, use mental simulation, the process of imagining actions and their consequences based on prior experience, to identify the actions needed to satisfy those preconditions. To address this gap, we propose the MENtal Simulation Agent (MENSA). MENSA uses an LLM to forecast action-observation pairs for future time steps and, based on that forecast, retrieves relevant past experiences for the agent, improving performance without fine-tuning model parameters. We evaluate MENSA in the interactive text environment ScienceWorld. MENSA not only outperforms the previous state of the art by 15.8 points (29%) when using a larger model (GPT-4o-mini) but also consistently improves performance across LLM sizes, including smaller models such as Phi-3-mini, where it gains 11.9 points (57.5%).
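The abstract describes a forecast-then-retrieve loop: simulate a few future action-observation pairs with the LLM, use that forecast (rather than only the current state) as the retrieval query over stored experiences, and then act with the retrieved experiences as few-shot in-context exemplars. The Python sketch below is for orientation only: the callable interfaces `llm(prompt)` and `embed(text)`, the function names, and the prompt wording are assumptions of this sketch, not the thesis's implementation.

```python
import numpy as np

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def mensa_step(llm, embed, experience_bank, task, history, horizon=3, k=2):
    """One MENSA-style decision step (illustrative sketch, not the thesis code).

    llm(prompt) -> str          : text-completion callable (assumed interface)
    embed(text) -> np.ndarray   : text-embedding callable (assumed interface)
    experience_bank             : list of (text, embedding) past trajectories
    """
    # 1. Mental simulation: forecast the next few action-observation
    #    pairs before committing to an action.
    forecast = llm(
        f"Task: {task}\nHistory: {history}\n"
        f"Imagine the next {horizon} action-observation pairs:"
    )

    # 2. Dynamic experience retrieval: score stored experiences by
    #    similarity to the forecast, not just the current observation.
    query = embed(forecast)
    scored = sorted(experience_bank, key=lambda e: cosine(query, e[1]), reverse=True)
    exemplars = [text for text, _ in scored[:k]]

    # 3. Few-shot in-context prompting: act with the retrieved
    #    experiences as exemplars; no parameter fine-tuning involved.
    action = llm(
        "Relevant past experiences:\n" + "\n---\n".join(exemplars) +
        f"\nTask: {task}\nHistory: {history}\nNext action:"
    )
    return action
```

Here `llm` and `embed` stand in for whatever completion and embedding backends the caller supplies; the thesis reports results with GPT-4o-mini and Phi-3-mini in ScienceWorld.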
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/97359
DOI: 10.6342/NTU202404517
Fulltext Rights: Authorized (worldwide open access)
Embargo Lift Date: 2025-05-08
Appears in Collections: Department of Computer Science and Information Engineering

Files in This Item:
File: ntu-113-2.pdf (1.96 MB, Adobe PDF)


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
