Large Scale Linear Reinforcement Learning and Tournament Learning Illustrated with Gomoku

Kai-Min Chuang; 莊凱閔

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/6676

Title:	大規模線性增強式學習與競賽式學習以五子棋為例 (Large Scale Linear Reinforcement Learning and Tournament Learning Illustrated with Gomoku)
Author:	Kai-Min Chuang (莊凱閔)
Advisor:	Chih-Jen Lin (林智仁)
Keywords:	Large-scale data, Linear support vector regression, Reinforcement learning, Tournament learning, Gomoku
Publication year:	2012
Degree:	Master's
Abstract:	Reinforcement learning was once a popular research topic, yet it remains at an early stage. Even its best-known success, TD-Gammon, a backgammon-playing agent, still relies on search techniques to improve its playing strength. In this thesis we apply two important techniques from supervised learning, large-scale data and linear support vector regression, to substantially improve the learning ability of reinforcement learning. In addition, we discuss whether two independent agents can strengthen each other by continually competing against one another; we call this new learning framework tournament learning. Both concepts are demonstrated with gomoku. The results show that the resulting agents reach a level competitive with human players without relying on search techniques, which suggests that the two concepts are practical and could be of great help for board games and other specific applications.
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/6676
Full-text authorization:	Authorized (publicly available worldwide)
Appears in department:	Department of Computer Science and Information Engineering

Files in this item:

File	Size	Format
ntu-101-1.pdf	1.6 MB	Adobe PDF

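Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.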
