Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/74036
Title: | Alleviation of Data Overdispersion On Implicit Feedback for Recommender Systems |
Author: | Li-Yen Kuo (郭立言) |
Advisor: | Ming-Syan Chen (陳銘憲) |
Keywords: | recommender systems, Poisson distribution, negative binomial distribution, learning to rank, implicit feedback |
Publication Year: | 2019 |
Degree: | Doctoral |
Abstract: | Matrix factorization has achieved great success in recommender systems. In real-world implicit feedback, the values of matrix entries approximately follow power-law distributions. More specifically, many entries have extraordinarily high values, which are called overdispersed in this dissertation. Commonly used regression-based matrix factorization is not only sensitive to overdispersed data but also unable to guarantee that the predicted values are consistent with users' preference orders. In light of this, we propose two perspectives for alleviating the effect of data overdispersion. The first perspective is based on learning to rank. We propose a framework for personalized ranking on Poisson factorization, which replaces the classical regression-based posterior with a learning-to-rank-based one. Owing to this combination of personalized ranking and Poisson factorization, the proposed framework not only preserves user preferences but also performs well on sparse matrices. Because the posterior that combines learning to rank and Poisson factorization no longer admits a conjugate prior, we approximate the estimation of the variational parameters and propose two optimization approaches based on variational inference. As long as the chosen learning-to-rank model provides first- and second-order partial derivatives, the proposed optimization algorithms can maximize the posterior within our framework, regardless of which learning-to-rank model is used. The second perspective is the consideration of failure exposure. When exposed to an item in a recommender system, a user may consume the item (known as success exposure) or neglect it (known as failure exposure). We propose a novel model, hierarchical negative binomial factorization (HNBF), which models dispersion with a hierarchical Bayesian structure rather than directly assigning a constant to the prior of dispersion, thus alleviating the effect of data overdispersion and improving recommendation performance. Moreover, we approximately factorize the dispersion of the zero entries into two low-rank matrices, limiting the per-epoch computational cost of updating to be linear in the number of nonzero entries. In experiments on implicit feedback data for recommendation tasks, the proposed methods outperform the state-of-the-art ones in terms of precision and recall. |
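The abstract's core premise is that implicit-feedback counts are overdispersed: a Poisson model forces the variance to equal the mean, while power-law-like data have far heavier tails, which is what motivates the negative binomial. The sketch below illustrates this on synthetic Zipf-distributed counts; the data, the seed, and the dispersion value `r` are illustrative assumptions, not the dissertation's datasets or parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for implicit-feedback counts: most values are small,
# a few are extraordinarily high (roughly power-law), as described above.
counts = rng.zipf(a=2.0, size=10_000)

mean, var = counts.mean(), counts.var()

# A Poisson model would force variance == mean; here the sample variance
# far exceeds the sample mean, i.e. the data are overdispersed.
print(f"mean={mean:.2f}, variance={var:.2f}, ratio={var / mean:.1f}")

# The negative binomial relaxes this constraint: with mean mu and
# dispersion r, its variance is mu + mu**2 / r, which exceeds mu for any
# finite r, so heavy-tailed counts are no longer a modeling violation.
mu, r = mean, 1.0  # r = 1.0 is an arbitrary illustrative dispersion
nb_var = mu + mu**2 / r
print(f"NB variance at r={r}: {nb_var:.2f} (Poisson variance would be {mu:.2f})")
```

This is why fixing the dispersion prior to a constant, as the abstract notes, is restrictive: the appropriate `r` varies across users and items, which HNBF addresses by placing a hierarchical Bayesian structure over it.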
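The framework's stated requirement on the learning-to-rank component is only that its objective expose first- and second-order partial derivatives. As a minimal illustration, the following uses a BPR-style pairwise logistic log-likelihood; this particular choice of objective is an assumption for the example, not the dissertation's exact model.

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def log_lik(margin: float) -> float:
    """log sigma(margin): pairwise log-likelihood that the consumed item
    ranks above the non-consumed one (BPR-style, illustrative only)."""
    return math.log(sigmoid(margin))

def d1(margin: float) -> float:
    """First derivative: d/dx log sigma(x) = 1 - sigma(x)."""
    return 1.0 - sigmoid(margin)

def d2(margin: float) -> float:
    """Second derivative: d^2/dx^2 log sigma(x) = -sigma(x)(1 - sigma(x))."""
    s = sigmoid(margin)
    return -s * (1.0 - s)

# Any ranking objective supplying d1 and d2 like this satisfies the
# framework's interface, per the abstract, regardless of the model used.
print(d1(0.5), d2(0.5))
```

Note that `d2` is always negative, so this objective is concave in the margin, which is the kind of curvature information the variational optimization can exploit.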
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/74036 |
DOI: | 10.6342/NTU201903449 |
Full-Text Access: | Fee-based authorization |
Appears in Collections: | Department of Electrical Engineering |
Files in This Item:
File | Size | Format |
---|---|---|
ntu-108-1.pdf (currently not authorized for public access) | 1.88 MB | Adobe PDF |
All items in the system are protected by copyright, with all rights reserved, unless otherwise indicated.