Learning Disentangled Representation of Image and Text Data for E-commerce Products

Zhong-Yu Huang; 黃中余

Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/72750

Title:	Learning Disentangled Representation of Image and Text Data for E-commerce Products (電商商品圖像與文本的解耦表征之學習)
Authors:	Zhong-Yu Huang 黃中余
Advisor:	林守德
Keyword:	Representation Learning, Disentangled Representation, Variational Dropout, Product Title Representation, Product Similarity
Publication Year :	2019
Degree:	Master's
Abstract:	Deep learning is good at generating distributed representations, but such representations are usually hard to interpret. Disentangled representation is a recently discussed concept that features modularity, compactness, and explicitness, and that can be interpreted through the underlying generating factors. This thesis uses aligned text and image data of e-commerce products to train a model that transforms a distributed product title representation into a disentangled one. The transformed representation can be divided into two modules: one encodes the information conveyed by both the product image and the title, while the other encodes the information that cannot be obtained from the image and can only be known or inferred from the title. We achieve this by introducing variational dropout, which also provides meaningful dropout rates learned from the data. The experiment and evaluation results show that the transformed disentangled representations are better at judging the similarity between different product titles, and that the different modules exhibit distinct behavior in the evaluation tasks, which may be useful for further applications. We also present preliminary evidence that the transformed representations largely satisfy the properties required of a disentangled representation.
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/72750
DOI:	10.6342/NTU201901757
Fulltext Rights:	Paid license
Appears in Collections:	Graduate Institute of Networking and Multimedia

Files in This Item:

File	Size	Format
ntu-108-1.pdf Restricted Access	1.28 MB	Adobe PDF
