Study on Contextual Bandit Problem with Multiple Actions (多動作情境式拉霸問題之研究)

Ya-Hsuan Chang; 張雅軒

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/17252

Title:	Study on Contextual Bandit Problem with Multiple Actions (多動作情境式拉霸問題之研究)
Author:	Ya-Hsuan Chang (張雅軒)
Advisor:	Hsuan-Tien Lin (林軒田)
Keywords:	Machine Learning, Contextual Bandit Problem, Upper Confidence Bound
Year of publication:	2013
Degree:	Master's
Abstract:	The contextual bandit problem is often used to model online applications such as article recommendation. However, the traditional problem cannot capture some properties of these applications, such as taking multiple actions in a single round. We propose a new Contextual Bandit Problem with Multiple Actions (CBMA), an extension of the traditional contextual bandit problem that fits these online applications better. We adapt some existing contextual bandit algorithms to the CBMA problem, and we propose a new Pairwise Regression with Upper Confidence Bound (PairUCB) algorithm that exploits the new properties of the CBMA problem. The experimental results demonstrate that PairUCB outperforms the other algorithms.
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/17252
Full-text authorization:	Not authorized
Appears in department:	Department of Computer Science and Information Engineering

Files in this document:

File	Size	Format
ntu-102-1.pdf (not authorized for public access)	610.74 kB	Adobe PDF


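Documents in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically stated otherwise.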
