TALM: Tree-Structured Multi-Agent Framework with Long-Term Memory for Scalable Code Generation

Ming-Tung Shen (沈明彤)

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/100601

Title:	TALM: Tree-Structured Multi-Agent Framework with Long-Term Memory for Scalable Code Generation (original title: TALM：應用於可擴展程式碼生成的具長期記憶之樹狀多代理協作框架)
Author:	Ming-Tung Shen (沈明彤)
Advisor:	Yuh-Jzer Joung (莊裕澤)
Publication Year:	2025
Degree:	Master's
Abstract:	Code generation is a promising domain for deploying agents in real-world engineering; it demands strong semantic understanding, logical reasoning, and error correction. Although large language models (LLMs) excel at natural language tasks, they struggle with structurally complex tasks that require multi-stage reasoning. Prior multi-agent frameworks address this through collaboration among agents but often suffer from rigid workflows, high error-correction costs, and no accumulation of experience. This thesis proposes TALM (Tree-Structured Multi-Agent Framework with Long-Term Memory), a multi-agent framework that integrates structured task decomposition, localized re-reasoning, and a long-term memory module. TALM adopts an extensible tree-based collaboration structure that, combined with a divide-and-conquer strategy, improves reasoning flexibility across task scales. The explicit parent-child relationships in the tree mean that, when a reasoning error occurs, only the affected subtree needs to be re-reasoned, which makes error correction more efficient. In addition, a long-term memory module built on a vector database supports semantic querying and integration of prior knowledge, enabling implicit self-improvement through experience reuse. Experiments on the HumanEval, BigCodeBench, and ClassEval code generation benchmarks show that TALM consistently achieves strong reasoning performance and good token efficiency, demonstrating its practical value for complex code generation tasks.
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/100601
DOI:	10.6342/NTU202504045
Full-Text License:	Not authorized
Electronic Full-Text Release Date:	N/A
Appears in Collections:	Department of Information Management
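
The subtree-level error correction described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: `TaskNode`, `solve`, `find_path`, `resolve_subtree`, and `stub_solver` are hypothetical names introduced here, and the solver is a deterministic stub standing in for whatever agent call the framework actually makes. The sketch only shows the general idea that, in a tree of subtasks, a faulty node can be redone together with its descendants while sibling results are reused.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class TaskNode:
    """Hypothetical task-tree node: one subtask plus its child subtasks."""
    name: str
    children: List["TaskNode"] = field(default_factory=list)
    result: Optional[str] = None
    attempts: int = 0  # number of times this node has been solved


Solver = Callable[[str, List[Optional[str]]], str]


def solve(node: TaskNode, solver: Solver) -> str:
    """Divide and conquer: solve the children first, then combine at the parent."""
    child_results = [solve(child, solver) for child in node.children]
    node.attempts += 1
    node.result = solver(node.name, child_results)
    return node.result


def find_path(root: TaskNode, target: str) -> Optional[List[TaskNode]]:
    """Return the list of nodes from root down to the node named target."""
    if root.name == target:
        return [root]
    for child in root.children:
        sub_path = find_path(child, target)
        if sub_path is not None:
            return [root] + sub_path
    return None


def resolve_subtree(root: TaskNode, faulty: str, solver: Solver) -> Optional[str]:
    """Redo only the faulty subtree, then recombine its ancestors.

    Sibling subtrees are not solved again; their cached results are reused.
    """
    path = find_path(root, faulty)
    if path is None:
        raise KeyError(faulty)
    solve(path[-1], solver)  # re-reason the faulty node and its descendants
    for ancestor in reversed(path[:-1]):
        ancestor.attempts += 1
        ancestor.result = solver(ancestor.name, [c.result for c in ancestor.children])
    return root.result


def stub_solver(name: str, child_results: List[Optional[str]]) -> str:
    """Assumption: stands in for an agent call; just nests the child results."""
    return f"{name}({','.join(r or '' for r in child_results)})"


# Build a small illustrative tree and solve it once.
add = TaskNode("add")
mul = TaskNode("mul")
compute = TaskNode("compute", [add, mul])
parse = TaskNode("parse")
root = TaskNode("root", [parse, compute])
first_result = solve(root, stub_solver)

# Suppose "compute" turned out to be wrong: redo only that subtree.
second_result = resolve_subtree(root, "compute", stub_solver)
```

In this sketch the `parse` node is solved once, while `compute`, its two children, and the root are each solved twice. The saving grows with the size of the untouched sibling subtrees.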

Files in This Item:

File	Size	Format
ntu-113-2.pdf (not authorized for public access)	1.41 MB	Adobe PDF

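Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.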
