Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98918
| Title: | 動態環境中基於常識及物體可供性的任務規劃 ADAPT: Commonsense and Affordance-aware Planning in Dynamic Environments |
| Author: | 陳姵安 Pei-An Chen |
| Advisor: | 徐宏民 Winston H. Hsu |
| Keywords: | Embodied AI, Mobile Manipulation, Affordance-Aware Task Planning, Multimodal In-Context Learning |
| Publication year: | 2025 |
| Degree: | Master's |
| Abstract: | Intelligent embodied agents should not simply follow instructions, because real-world environments often involve unexpected conditions and exceptions. However, existing methods usually focus on executing instructions directly, without considering whether the target objects can actually be manipulated; that is, they lack the ability to assess available affordances. To address this limitation, we introduce ADAPT, a practical benchmark that challenges agents to go beyond simple instruction-following. In ADAPT, objects may be in unusable states, such as being in use or dirty, and this information is not specified in the instructions. Agents must perceive the environment, recognize the object's condition, and plan actions to restore the object to a usable state, such as cleaning a dirty fork before placing it into a cup. To enable this capability, we propose a plug-and-play module called Affordance Reasoner and Action-aware Memory (ARAM). The module helps the agent reason about object affordances and apply commonsense planning to detect unusable states and generate alternative solutions. When integrated with FILM, ARAM achieves up to 70% relative improvement in task success and 33% in goal completion, demonstrating its ability to enhance both task completion and planning efficiency in dynamic environments. |
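| URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98918 |
| DOI: | 10.6342/NTU202503907 |
| Full-text authorization: | Authorized (open to on-campus access only) |
| Electronic full-text release date: | 2025-08-26 |
| Appears in collections: | Department of Computer Science and Information Engineering |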
Files in this item:
| File | Size | Format | |
|---|---|---|---|
| ntu-113-2.pdf Access restricted to NTU campus IP addresses (from off campus, please use the off-campus VPN service) | 4.67 MB | Adobe PDF |
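Unless their copyright terms are specifically indicated otherwise, all items in the system are protected by copyright, with all rights reserved.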
