Deep Reinforcement Learning on Index-Tracking (深度強化式學習在指數追蹤的應用)

Wen-Hao Chung; 鍾文豪

Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/1219

Title:	Deep Reinforcement Learning on Index-Tracking (深度強化式學習在指數追蹤的應用)
Authors:	Wen-Hao Chung 鍾文豪
Advisor:	葉小蓁
Co-Advisor:	韓傳祥
Keyword:	Index-tracking, Portfolio Management, Reinforcement Learning, Policy Gradient, Tracking Difference
Publication Year :	2018
Degree:	Master's
Abstract:	Index tracking is a form of portfolio management: a portfolio is constructed to track the performance of a given index while minimizing both the tracking difference and the tracking error. If the index constituents and their weights are known, or the rules for compiling the index are public, index tracking is straightforward. But what if all of that information is private? This work proposes a deep reinforcement learning (RL) method that builds an index-tracking portfolio without knowing the index constituents, their weights, or the index construction rules. The portfolio is built with Policy Gradient, which maps state information to continuous actions and is therefore better suited to portfolio management than deep Q-learning. All common stocks in the U.S. stock market serve as inputs to the RL model, which is used to track stock indexes (S&P 500 and NASDAQ Composite) and active funds (FSCSX, FBSOX, and NASDX). The mean square of the tracking difference is the main measure of tracking performance. The experimental results show that the index-tracking portfolio built by the proposed method tracks its target well. Over the out-of-sample test period, the mean square of the tracking difference reaches at least the 2.71E-05 level.
URI:	http://tdr.lib.ntu.edu.tw/handle/123456789/1219
DOI:	10.6342/NTU201801196
Fulltext Rights:	Authorized (open access worldwide)
Appears in Collections:	Department of Finance
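
The abstract names the mean square of the tracking difference as its main metric. The sketch below illustrates one common reading of that metric: the per-period tracking difference is taken to be the portfolio return minus the index return, and the metric is the average of its squares. That definition, the fixed portfolio weights, the function names, and all numbers are assumptions made for illustration; they are not taken from the thesis.

```python
from typing import List, Sequence


def portfolio_returns(weights: Sequence[float],
                      asset_returns: Sequence[Sequence[float]]) -> List[float]:
    """Per-period portfolio return as the weighted sum of asset returns.

    Assumes the same weights are held in every period (a simplification).
    """
    return [sum(w * r for w, r in zip(weights, period))
            for period in asset_returns]


def mean_square_tracking_difference(portfolio: Sequence[float],
                                    index: Sequence[float]) -> float:
    """Average of squared (portfolio return - index return) over all periods."""
    if len(portfolio) != len(index) or len(portfolio) == 0:
        raise ValueError("return series must be non-empty and of equal length")
    diffs = [p - i for p, i in zip(portfolio, index)]
    return sum(d * d for d in diffs) / len(diffs)


# Made-up illustrative data: three assets over three periods.
weights = [0.5, 0.3, 0.2]            # hypothetical portfolio weights, sum to 1
asset_returns = [
    [0.010, -0.004, 0.002],
    [-0.006, 0.003, 0.001],
    [0.004, 0.000, -0.002],
]
index_returns = [0.004, -0.002, 0.001]  # hypothetical index returns

port = portfolio_returns(weights, asset_returns)
msd = mean_square_tracking_difference(port, index_returns)
print(f"mean square tracking difference: {msd:.3e}")
```

A lower value means the portfolio's returns stay closer to the target's returns period by period, which is the sense in which the abstract reports its 2.71E-05 figure.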

Files in This Item:

File	Size	Format
ntu-107-1.pdf	1.45 MB	Adobe PDF
