Leveraging Language Guidance for Skill-Interpolation in Reinforcement Learning

Chen-Yu Lin (林辰宇)

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/93415

Title:	Leveraging Language Guidance for Skill-Interpolation in Reinforcement Learning
Author:	Chen-Yu Lin (林辰宇)
Advisor:	Shao-Hua Sun (孫紹華)
Keywords:	Reinforcement Learning, Skill-Based Reinforcement Learning, Language Model, Skill Interpolation
Year of Publication:	2024
Degree:	Master's
Abstract:	In reinforcement learning (RL), acquiring new skills is a longstanding challenge that often requires extensive training and exploration. Skill-based RL is promising, but it has struggled to transfer previously learned skills effectively when acquiring novel ones. This research aims to bridge that gap by introducing an approach termed "skill-interpolation", which uses language models to guide the generation of new skills from prior learning experience. Unlike traditional RL methods, which focus primarily on planning and decision-making, our approach emphasizes the interpolation of locomotion skills. By integrating language models into the RL framework, we aim to enable agents to transition seamlessly between learned skills, improving their adaptability and efficiency in diverse environments. Inspired by the capabilities of language models in natural language processing tasks, we propose a method in which agents use semantic representations to generate intermediate skills through interpolation. For example, we explore interpolating between running and jumping to produce skills such as jogging or galloping. Through empirical evaluation, we demonstrate that the approach enables agents to acquire new skills more efficiently by leveraging knowledge gained from previous training. We also analyze how factors such as model architecture, training data, and semantic representation affect the effectiveness of skill-interpolation. Overall, this research contributes to skill-based reinforcement learning by introducing a paradigm that uses language models to facilitate skill acquisition and transfer. The proposed framework opens avenues for future research on enhancing agent adaptability and autonomy in dynamic environments.
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/93415
DOI:	10.6342/NTU202402114
Full-Text Authorization:	Authorized (on-campus access only)
Appears in Collections:	Department of Electrical Engineering

Files in This Item:

File	Size	Format
ntu-112-2.pdf (access restricted to NTU campus IP addresses; off campus, please use the VPN service)	567.59 kB	Adobe PDF


Items in this system are protected by copyright, with all rights reserved, unless their copyright terms state otherwise.
