Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/1215
Title: | LEVERAGE EXPERIMENTS FROM OTHER TASKS TO SPEED UP TRAINING (利用其他遊戲的經驗來加速新遊戲的訓練) |
Author: | Ying Li (李盈) |
Advisor: | Winston Hsu (徐宏民) |
Keywords: | Reinforcement Learning |
Year of Publication: | 2018 |
Degree: | Master's |
Abstract: | Although deep reinforcement learning (RL) approaches have achieved impressive results on a variety of video games, training by RL still requires a great deal of time and computational resources, since it is difficult to extract information from sparse rewards through random exploration of the environment. Many works have attempted to accelerate the RL process by leveraging relevant knowledge from past experience. Some formulated this as a transfer learning problem, exploiting relevant knowledge from other games; others formulated it as a multi-task problem, trying to find representations capable of generalizing to new tasks. In this work, we treat the process by which the agent interacts with the environment and collects experience as a way of generating training data, which needs more variance to make the training process more efficient. We then load models trained on other game environments into the new game we want to train, in order to generate different training data. The results show that using policies from other games with different goals, instead of randomly selected actions, can speed up the learning process. |
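The data-collection idea the abstract describes (seeding training with experience from a policy pre-trained on a different game, rather than from purely random actions) can be sketched roughly as follows. Everything here — `ToyEnv`, `foreign_policy`, and all parameters — is an illustrative stand-in, not the thesis's actual environments, models, or hyperparameters:

```python
import random
from collections import deque

class ToyEnv:
    """A trivial chain environment with a sparse reward at the far end."""
    def __init__(self, length=10):
        self.length = length
        self.pos = 0

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):  # action: 0 = move left, 1 = move right
        self.pos = max(0, min(self.length - 1, self.pos + (1 if action == 1 else -1)))
        done = self.pos == self.length - 1
        reward = 1.0 if done else 0.0  # sparse: only the goal state pays out
        return self.pos, reward, done

def random_policy(state):
    # Baseline: uniformly random exploration.
    return random.choice([0, 1])

def foreign_policy(state):
    # Stand-in for a model trained on another game: it acts with a directed
    # (if mismatched) goal, here a bias toward moving right.
    return 1 if random.random() < 0.8 else 0

def collect_experience(env, policy, steps=200):
    """Roll out `policy` and store (state, action, reward, next_state) tuples."""
    buffer = deque(maxlen=steps)
    state = env.reset()
    for _ in range(steps):
        action = policy(state)
        next_state, reward, done = env.step(action)
        buffer.append((state, action, reward, next_state))
        state = env.reset() if done else next_state
    return buffer

random.seed(0)
env = ToyEnv()
rand_data = collect_experience(env, random_policy)
foreign_data = collect_experience(env, foreign_policy)
# A goal-directed foreign policy tends to reach the sparse reward more often,
# yielding more varied and informative transitions for the new task.
print(sum(r for _, _, r, _ in rand_data), sum(r for _, _, r, _ in foreign_data))
```

The point of the sketch is only the buffer-filling step: swapping `random_policy` for `foreign_policy` changes the distribution of collected transitions without changing the training loop itself.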
URI: | http://tdr.lib.ntu.edu.tw/handle/123456789/1215 |
DOI: | 10.6342/NTU201804197 |
Fulltext Permission: | Authorized (worldwide public access) |
Appears in Collections: | Department of Computer Science and Information Engineering |
Files in This Item:
File | Size | Format | |
---|---|---|---|
ntu-107-1.pdf | 4.63 MB | Adobe PDF | View/Open |
All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.