  1. NTU Theses and Dissertations Repository
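  2. College of Electrical Engineering and Computer Science
  3. Department of Computer Science and Information Engineering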
Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98918
Title: ADAPT: Commonsense and Affordance-aware Planning in Dynamic Environments (動態環境中基於常識及物體可供性的任務規劃)
Authors: Pei-An Chen (陳姵安)
Advisor: Winston H. Hsu (徐宏民)
Keyword: Embodied AI, Mobile Manipulation, Affordance-Aware Task Planning, Multimodal In-Context Learning
Publication Year: 2025
Degree: Master's
Abstract: Intelligent embodied agents should not simply follow instructions, because real-world environments often involve unexpected conditions and exceptions. Existing methods, however, usually focus on executing instructions directly without checking whether the target objects can actually be manipulated; that is, they cannot assess which affordances are available. To address this limitation, we introduce ADAPT, a practical benchmark that challenges agents to go beyond simple instruction-following.

In ADAPT, objects may be in unusable states, such as in use or dirty, and the instructions do not state this. Agents must perceive the environment, recognize each object's condition, and plan actions that restore the object to a usable state, such as cleaning a dirty fork before placing it into a cup.

To enable this capability, we propose a plug-and-play module called Affordance Reasoner and Action-aware Memory (ARAM). The module helps the agent reason about object affordances and apply commonsense planning to detect unusable states and generate alternative solutions.

When integrated with FILM, ARAM achieves up to 70% relative improvement in task success and 33% in goal completion, demonstrating that it enhances both task completion and planning efficiency in dynamic environments.
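The perceive, recognize, and restore loop described in the abstract can be illustrated with a small sketch. This is not ARAM's implementation, which this record does not describe; the state labels, action names, `RECOVERY_ACTION` table, and `repair_plan` helper are all hypothetical and chosen only to mirror the dirty-fork example.

```python
from dataclasses import dataclass, field

# Hypothetical names throughout: the state labels, action names, and
# recovery table below are illustrative assumptions, not ARAM's interface.
RECOVERY_ACTION = {"dirty": "clean", "in_use": "wait_until_free"}
REQUIRES_USABLE = {"put_in_cup"}  # actions that need a usable object


@dataclass
class ObservedObject:
    name: str
    conditions: set = field(default_factory=set)  # e.g. {"dirty"}


def repair_plan(plan, observed):
    """Insert recovery steps before actions that need a usable object."""
    repaired = []
    for action, target in plan:
        obj = observed.get(target)
        if obj is not None and action in REQUIRES_USABLE:
            for cond in sorted(obj.conditions):
                if cond in RECOVERY_ACTION:
                    repaired.append((RECOVERY_ACTION[cond], target))
            # Assume recovery succeeds, so the object is usable afterwards.
            obj.conditions -= set(RECOVERY_ACTION)
        repaired.append((action, target))
    return repaired


observed = {"fork": ObservedObject("fork", {"dirty"})}
plan = [("pick_up", "fork"), ("put_in_cup", "fork")]
repaired = repair_plan(plan, observed)
print(repaired)
# [('pick_up', 'fork'), ('clean', 'fork'), ('put_in_cup', 'fork')]
```

The instruction-derived plan never mentions cleaning; the extra step appears only because the observed state of the fork makes the placement action inapplicable, which is the gap between instruction-following and affordance-aware planning that the abstract highlights.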
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98918
DOI: 10.6342/NTU202503907
Fulltext Rights: Authorization granted (open access on campus only)
Embargo Lift Date: 2025-08-26
Appears in Collections: Department of Computer Science and Information Engineering

Files in This Item:
File | Size | Format
ntu-113-2.pdf (access limited to NTU IP range) | 4.67 MB | Adobe PDF
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
