Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/97992

| Title: | 推動推薦系統的深度表徵學習:從序列到多模態與圖結構數據 Advancing Deep Representation Learning for Recommendation: From Sequential to Multimodal and Graph Structure Data |
| Author: | 洪明邑 Ming-Yi Hong |
| Advisor: | 林澤 Che Lin |
| Co-advisor: | 王志宇 Chih-Yu Wang |
| Keywords: | Recommendation Systems, Sequential Recommendation, Temporal and Contextual Information, Trend-aware, Multimodal Sequential Recommendation, Time-aligned Shared Token, Large Language Model, Heterogeneous Information Network, Graph Neural Network, Heterogeneous Graph Autoencoder, Graph Data Augmentation, Synthetic Graph Generation, Explainable Artificial Intelligence |
| Publication Year: | 2025 |
| Degree: | Doctoral |
| Abstract: | The rapid growth of e-commerce and social networks, driven by the availability of diverse data modalities, has accelerated the evolution of recommendation systems. Traditional sequential recommendation models focus primarily on user interaction sequences, yet they often fail to capture the full complexity of user preferences and the influence of emerging trends. As user interactions diversify to encompass text, images, and videos, the need for multimodal models becomes critical. These models enable richer feature extraction and cross-modal understanding, narrowing the gap in personalized content delivery. At the same time, social networks are reshaping digital interactions, demanding a deeper understanding of complex interaction relationships. Graph Neural Networks (GNNs) have emerged as powerful tools for modeling these intricate graph structures, enhancing representation learning and prediction accuracy. However, effectively using and integrating the various node and edge types in heterogeneous information networks (HINs) remains challenging, and designing a robust and general heterogeneous GNN (HGNN) is a vital and widely pursued research objective. Furthermore, as GNNs are widely adopted for social network analysis, interpreting and understanding these models becomes more important, and there is a growing need for reliable benchmark graph datasets, which are especially scarce for HINs. To address these issues, first, we propose a temporal and contextual trend-aware Transformer model that captures temporal and contextual click behaviors and emerging trends to optimize click-through rate (CTR) prediction. Second, we use a Time-aligned Shared Token (TST) framework to integrate multimodal information, including text, images, and prices, which further improves sequential recommendation by ensuring temporal consistency and enhancing recommendation accuracy. Third, for graph-based representation learning tasks, we propose a type-aware heterogeneous graph autoencoder and augmentation strategy that enhances representation learning by optimizing edge connections and strengthening noise resistance. Fourth, we systematically generate synthetic HINs with explanation ground truths to serve as robust benchmarks for evaluating graph learning and explainability, addressing the scarcity of diverse heterogeneous datasets. Through these designs, we push the boundaries of deep representation learning for recommendation systems by fully leveraging sequential and multimodal information. We also achieve effective generation of, and semi-supervised learning on, diverse HINs, addressing the scarcity of HIN datasets and enhancing the learning capabilities of HGNNs. |
| URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/97992 |
| DOI: | 10.6342/NTU202501410 |
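| Full-text Authorization: | Authorized (open to the public worldwide) |
| Electronic Full-text Release Date: | 2030-06-30 |
| Appears in Collections: | Data Science Degree Program |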
Files in this item:
| File | Size | Format |
|---|---|---|
| ntu-113-2.pdf (available online after 2030-06-30) | 11.08 MB | Adobe PDF |
Items in the system are protected by copyright, with all rights reserved, unless their copyright terms are otherwise indicated.
