Improving Variational Auto-Encoder Based Neural Topic Model with Sparse Latent Concept Layer

Sheng-Yao Shen; 沈聖堯

Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/68912

Title:	Improving Variational Auto-Encoder Based Neural Topic Model with Sparse Latent Concept Layer (以稀疏潛在概念層改善基於變分自編碼架構的神經網絡主題模型)
Authors:	Sheng-Yao Shen 沈聖堯
Advisor:	Chung-Yang Huang (黃鐘揚)
Keyword:	Latent Dirichlet Allocation, Topic Model, Variational Auto-encoder
Publication Year :	2017
Degree:	Master's
Abstract:	The primary contribution of this thesis is a simple topic model based on the variational auto-encoder, together with an effective criterion for selecting topic words. By decomposing the probability matrix into the product of a topic matrix and a word matrix, we introduce sparse latent concepts (SLC) as the dimensions of a semantic vector space shared by topics and words. The model builds on the assumption that topics are sparse in latent concepts, that is, each topic is represented by only a few of them, and it selects topic words by the semantic similarity between topic vectors and word vectors. In the experiments, the SLC-based model achieves higher average topic coherence than the non-SLC-based model.
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/68912
DOI:	10.6342/NTU201702238
Fulltext Rights:	Paid license
Appears in Collections:	Department of Electrical Engineering
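
The factorization and the topic-word selection rule described in the abstract can be illustrated with a small sketch. This is not the thesis's implementation: the record does not give the exact formulation, so the reading of the "probability matrix" as a topic-word distribution, the softmax normalization, the use of cosine similarity, and every name and number below (`topic_vecs`, `word_vecs`, the toy vocabulary) are assumptions made for illustration. The variational auto-encoder that would learn these vectors is omitted.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def topic_word_probs(topic_vecs, word_vecs):
    """Topic-word distribution from the product of topic and word matrices.

    Assumption: scores are inner products in the latent concept space,
    normalized per topic with a softmax.
    """
    return [softmax([dot(t, w) for w in word_vecs]) for t in topic_vecs]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    nu, nv = math.sqrt(dot(u, u)), math.sqrt(dot(v, v))
    return dot(u, v) / (nu * nv) if nu and nv else 0.0


def top_words(topic_vecs, word_vecs, vocab, n=2):
    """Pick the n words whose vectors are most similar to each topic vector."""
    result = []
    for t in topic_vecs:
        ranked = sorted(
            range(len(vocab)),
            key=lambda j: cosine(t, word_vecs[j]),
            reverse=True,
        )
        result.append([vocab[j] for j in ranked[:n]])
    return result


# Toy data (made up): 4 words and 2 topics in a 3-dimensional concept space.
vocab = ["goal", "match", "stock", "market"]
word_vecs = [
    [0.9, 0.1, 0.0],
    [0.8, 0.0, 0.2],
    [0.0, 0.9, 0.1],
    [0.1, 0.8, 0.0],
]
# Sparse topics: each topic is active on a single latent concept.
topic_vecs = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
]

beta = topic_word_probs(topic_vecs, word_vecs)  # one probability row per topic
top = top_words(topic_vecs, word_vecs, vocab)
print(top)
```

In this toy setup each topic vector is nonzero on only one concept, which is the kind of sparsity the abstract assumes. Ranking by similarity in the concept space, rather than by the probabilities in `beta`, is what the abstract describes as the topic word selection function.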

Files in This Item:

File	Size	Format
ntu-106-1.pdf Restricted Access	2.67 MB	Adobe PDF
