NTU Theses and Dissertations Repository > College of Electrical Engineering and Computer Science > Graduate Institute of Communication Engineering
Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98384
Title: Learning to Coordinate In-Context with Coordination Transformers (利用合作轉換器來學習情境合作)
Authors: Huai-Chih Wang (王懷志)
Advisor: Shao-Hua Sun (孫紹華)
Keywords: In-Context Learning, In-Context Reinforcement Learning, Multi-Agent Coordination, Human-AI Collaboration, Transformers for Decision Making
Publication Year: 2025
Degree: Master's
Abstract: Effective coordination among artificial agents in dynamic and uncertain environments remains a significant challenge in multi-agent systems. Existing approaches, such as self-play and population-based methods, either generalize poorly to unseen partners or require impractically extensive training. To overcome these limitations, we propose Coordination Transformers (Coot), a novel in-context coordination framework that uses recent interaction histories to rapidly adapt to unseen partners. Unlike previous approaches that primarily aim to increase the diversity of training partners, Coot explicitly focuses on adapting to new partner behaviors by predicting actions aligned with observed partner interactions. Trained on interaction trajectories collected from diverse pairs of agents with complementary behaviors, Coot quickly learns effective coordination strategies without explicit supervision or fine-tuning. Evaluations on the Overcooked benchmark demonstrate that Coot significantly outperforms baseline methods in coordination tasks involving previously unseen partners. Human evaluations further confirm Coot as the most effective collaborative partner, while extensive ablations highlight its robustness, flexibility, and sensitivity to context in multi-agent scenarios.
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98384
DOI: 10.6342/NTU202502134
Fulltext Rights: Authorized (access restricted to campus)
Embargo Lift Date: 2030-07-21
Appears in Collections: Graduate Institute of Communication Engineering (電信工程學研究所)

Files in This Item:
File: ntu-113-2.pdf (Restricted Access)
Size: 8.39 MB
Format: Adobe PDF
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

© NTU Library All Rights Reserved