Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/1215
Title: 利用其他遊戲的經驗來加速新遊戲的訓練
LEVERAGE EXPERIMENTS FROM OTHER TASKS TO SPEED UP TRAINING
Authors: Ying Li
李盈
Advisor: Winston Hsu (徐宏民)
Keyword: Reinforcement Learning (強化學習)
Publication Year: 2018
Degree: Master's
Abstract: Although deep reinforcement learning (RL) has achieved impressive results in a variety of video games, RL training still requires a lot of time and computational resources, because it is difficult to extract information from sparse rewards by exploring the environment at random.
Many works have attempted to accelerate the RL process by leveraging relevant knowledge from past experience. Some formulate this as a transfer learning problem and exploit relevant knowledge from other games. Others formulate it as a multi-task problem and try to find representations that generalize to new tasks. In this work, we treat the process in which the agent interacts with the environment and collects experience as a way of generating training data, and that data needs to be more varied for training to be efficient. We therefore load models trained in other game environments into the new game we want to train on, in order to generate different training data. The results show that using policies from other games with different goals, instead of taking random actions, can speed up the learning process.
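The abstract does not specify how the other-game models are combined with the agent being trained, so the following is only a minimal sketch under stated assumptions: a toy corridor environment with a sparse reward, and a hypothetical `mix_policies` helper that, with probability `explore_prob`, takes the exploratory action from a policy trained elsewhere instead of sampling it uniformly at random. The environment, the stand-in "other game" policy, and all names are illustrative and are not taken from the thesis.

```python
import random
from typing import Callable, List, Tuple

State = int
Action = int
Policy = Callable[[State], Action]
Transition = Tuple[State, Action, float, State]

NUM_ACTIONS = 2  # toy action set: 0 = left, 1 = right
GOAL = 5         # toy corridor 0..GOAL; reward only at the right end


def corridor_step(state: State, action: Action) -> Tuple[State, float]:
    """Toy sparse-reward environment: reward 1.0 only when a step ends at GOAL."""
    next_state = max(0, min(GOAL, state + (1 if action == 1 else -1)))
    return next_state, (1.0 if next_state == GOAL else 0.0)


def random_policy(rng: random.Random) -> Policy:
    """Baseline exploration: uniformly random actions."""
    return lambda state: rng.randrange(NUM_ACTIONS)


def other_game_policy(state: State) -> Action:
    """Hypothetical stand-in for a model trained on a different game.

    Here it simply always moves right; a real model would map observations
    to actions learned under another game's goal.
    """
    return 1


def mix_policies(learner: Policy, explorer: Policy,
                 explore_prob: float, rng: random.Random) -> Policy:
    """With probability explore_prob act with the explorer, else with the learner."""
    def act(state: State) -> Action:
        if rng.random() < explore_prob:
            return explorer(state)
        return learner(state)
    return act


def collect(policy: Policy, num_steps: int) -> List[Transition]:
    """Run the policy and record (state, action, reward, next_state) tuples."""
    state, data = 0, []
    for _ in range(num_steps):
        action = policy(state)
        next_state, reward = corridor_step(state, action)
        data.append((state, action, reward, next_state))
        # Start a new episode once the goal has been reached.
        state = 0 if next_state == GOAL else next_state
    return data


if __name__ == "__main__":
    rng = random.Random(0)
    untrained_learner: Policy = lambda state: 0  # assumed: not yet useful
    for name, explorer in [("random", random_policy(rng)),
                           ("other-game", other_game_policy)]:
        behavior = mix_policies(untrained_learner, explorer, 0.5, rng)
        rewards = sum(r for _, _, r, _ in collect(behavior, 200))
        print(name, "explorer, rewarded transitions:", rewards)
```

The two explorers are interchangeable here, so the same collection loop can be used to compare how many rewarded transitions each one yields for the learner's replay data. Whether a policy from another game helps in practice depends on how its behavior relates to the new game, which this toy example does not capture.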
URI: http://tdr.lib.ntu.edu.tw/handle/123456789/1215
DOI: 10.6342/NTU201804197
Fulltext Rights: Authorized (open to the public worldwide)
Appears in Collections: Department of Computer Science and Information Engineering (資訊工程學系)

Files in This Item:
File: ntu-107-1.pdf
Size: 4.63 MB
Format: Adobe PDF
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
