Discover Monte Carlo Algorithm on Spin Ice Model Using Reinforcement Learning

Kai-Wen Zhao; 趙愷文

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/71053

Title:	Discover Monte Carlo Algorithm on Spin Ice Model Using Reinforcement Learning (強化學習用於發明自旋冰模型上的蒙地卡羅演算法)
Author:	Kai-Wen Zhao (趙愷文)
Advisor:	Ying-Jer Kao (高英哲)
Keywords:	Reinforcement Learning, Deep Learning, Monte Carlo Algorithm, Spin Ice Model
Publication year:	2018
Degree:	Master's
Abstract:	Reinforcement learning is a fast-growing research field within machine learning, owing to its strong exploration and decision-making capability in dynamic environments. Inspired by psychological learning theories, the reinforcement learning framework contains a software agent with an improvable policy: the agent takes actions on the environment and attempts to achieve a goal according to the reward it receives, refining its behavior through repeated trial and correction. A policy is a stochastic rule that governs the agent's decision-making process and is updated based on the response of the environment. In this work, we apply the reinforcement learning framework to the spin ice model, letting the agent devise a Monte Carlo algorithm on its own. Spin ice is a frustrated magnetic system with a strong topological constraint on its low-energy configurations. In the physics community, it is well known that the loop Monte Carlo algorithm can update the system efficiently without breaking its local constraint. From a broader perspective, however, efficient global update schemes tend to be problem-dependent, so a new system requires a newly designed algorithm. We therefore develop a reinforcement learning method that parameterizes the Monte Carlo transition operator with deep neural networks. By extending the Markov chain to a Markov decision process, the agent can adaptively search for a global update policy through its interactions with the physical model. We believe it may serve as a general framework for the search for Monte Carlo update schemes.
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/71053
DOI:	10.6342/NTU201802242
Full-text license:	Paid license
Appears in department:	Department of Physics
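
The abstract refers to a loop Monte Carlo update that changes the configuration without breaking the local constraint. As a rough illustration of that idea only, the sketch below implements a textbook-style short-loop update on a periodic square-ice (six-vertex) lattice, where the constraint is the two-in, two-out ice rule at every vertex. This is not the reinforcement learning method of the thesis. The choice of lattice, the arrow representation, and all function names (`make_ordered_state`, `net_outflow`, `obeys_ice_rule`, `outgoing_edges`, `short_loop_update`) are assumptions made for this example.

```python
import random


def make_ordered_state(L):
    """All arrows point in +x / +y: a simple state that satisfies the ice rule."""
    h = [[1] * L for _ in range(L)]  # h[x][y]: arrow on the edge (x, y) -> (x+1, y)
    v = [[1] * L for _ in range(L)]  # v[x][y]: arrow on the edge (x, y) -> (x, y+1)
    return h, v


def net_outflow(h, v, x, y):
    """Outgoing minus incoming arrows at vertex (x, y); 0 means two-in, two-out."""
    L = len(h)
    return h[x][y] - h[(x - 1) % L][y] + v[x][y] - v[x][(y - 1) % L]


def obeys_ice_rule(h, v):
    L = len(h)
    return all(net_outflow(h, v, x, y) == 0 for x in range(L) for y in range(L))


def outgoing_edges(h, v, x, y):
    """Edges whose arrow points away from (x, y), as (array, i, j, neighbour)."""
    L = len(h)
    xm, ym = (x - 1) % L, (y - 1) % L
    candidates = [
        (h, x, y, ((x + 1) % L, y), +1),   # right edge is outgoing when +1
        (h, xm, y, (xm, y), -1),           # left edge is outgoing when -1
        (v, x, y, (x, (y + 1) % L), +1),   # upper edge is outgoing when +1
        (v, x, ym, (x, ym), -1),           # lower edge is outgoing when -1
    ]
    return [(arr, i, j, nbr) for arr, i, j, nbr, out in candidates if arr[i][j] == out]


def short_loop_update(h, v, rng):
    """Walk along outgoing arrows until a vertex repeats, then flip that loop.

    Every vertex on the closed loop has one loop arrow coming in and one
    going out, so reversing the loop keeps two-in, two-out everywhere.
    Returns the number of arrows flipped.
    """
    L = len(h)
    site = (rng.randrange(L), rng.randrange(L))
    first_visit = {site: 0}  # vertex -> position along the walk
    path = []                # path[k] is the edge leaving the k-th vertex
    while True:
        arr, i, j, site = rng.choice(outgoing_edges(h, v, *site))
        path.append((arr, i, j))
        if site in first_visit:
            loop = path[first_visit[site]:]  # drop the tail before the loop
            break
        first_visit[site] = len(first_visit)
    for arr, i, j in loop:
        arr[i][j] *= -1
    return len(loop)


rng = random.Random(0)
h, v = make_ordered_state(4)
loop_lengths = [short_loop_update(h, v, rng) for _ in range(100)]
print("ice rule still satisfied:", obeys_ice_rule(h, v))
```

In this hand-written update the walker picks uniformly between the two outgoing arrows at each vertex. In the language of the abstract, a learned policy would presumably take the place of such a fixed rule, though the record does not give those details.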

Files in this item:

File	Size	Format
ntu-107-1.pdf (not authorized for public access)	2.48 MB	Adobe PDF


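Unless their copyright terms are specifically indicated otherwise, all items in the system are protected by copyright, with all rights reserved.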
