Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/72323
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 黃鐘揚(Chung-Yang Ric Huang) | |
dc.contributor.author | Ko-Ching Liang | en |
dc.contributor.author | 梁可擎 | zh_TW |
dc.date.accessioned | 2021-06-17T06:35:25Z | - |
dc.date.available | 2018-08-21 | |
dc.date.copyright | 2018-08-21 | |
dc.date.issued | 2018 | |
dc.date.submitted | 2018-08-15 | |
dc.identifier.citation | [1] L. van der Maaten and G.E. Hinton, “Visualizing data using t-SNE”, Journal of Machine Learning Research, 2008.
[2] G.E. Hinton and R.R. Salakhutdinov, “Reducing the dimensionality of data with neural networks”, Science, 2006.
[3] J. Xie, R. Girshick, and A. Farhadi, “Unsupervised deep embedding for clustering analysis”, International Conference on Machine Learning, 2016.
[4] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition”, Proceedings of the IEEE, November 1998.
[5] A. Krizhevsky, “Learning multiple layers of features from tiny images”, Master’s thesis, Department of Computer Science, University of Toronto, 2009.
[6] C.A. Sugar and G.M. James, “Finding the number of clusters in a dataset”, Journal of the American Statistical Association, 2003.
[7] G.E. Hinton, “Learning distributed representations of concepts”, Proceedings of the 8th Annual Conference of the Cognitive Science Society, 1986.
[8] Y. Bengio, R. Ducharme, P. Vincent, and C. Jauvin, “A neural probabilistic language model”, Journal of Machine Learning Research, 2003.
[9] Y. Bengio, A. Courville, and P. Vincent, “Representation learning: a review and new perspectives”, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013.
[10] T. Mikolov, K. Chen, G. Corrado, and J. Dean, “Efficient estimation of word representations in vector space”, International Conference on Learning Representations Workshop, 2013.
[11] A. Joulin, E. Grave, P. Bojanowski, M. Douze, H. Jégou, and T. Mikolov, “FastText.zip: compressing text classification models”, arXiv preprint arXiv:1612.03651, 2016.
[12] S.R. Bowman, L. Vilnis, O. Vinyals, A.M. Dai, R. Jozefowicz, and S. Bengio, “Generating sentences from a continuous space”, arXiv preprint arXiv:1511.06349, 2015.
[13] L. van der Maaten, “Learning a parametric embedding by preserving local structure”, International Conference on Artificial Intelligence and Statistics, 2009.
[14] Google, “word2vec”, https://code.google.com/archive/p/word2vec/.
[15] R. Pascanu, T. Mikolov, and Y. Bengio, “On the difficulty of training recurrent neural networks”, International Conference on Machine Learning, 2013.
[16] S. Hochreiter, “Untersuchungen zu dynamischen neuronalen Netzen”, Diploma thesis, Institut f. Informatik, Technische Univ. Munich, 1991.
[17] Y. Bengio, P. Simard, and P. Frasconi, “Learning long-term dependencies with gradient descent is difficult”, IEEE Transactions on Neural Networks, 1994.
[18] W. Zaremba, I. Sutskever, and O. Vinyals, “Recurrent neural network regularization”, arXiv preprint arXiv:1409.2329, September 2014.
[19] C. Laurent, G. Pereyra, P. Brakel, Y. Zhang, and Y. Bengio, “Batch normalized recurrent neural networks”, arXiv preprint arXiv:1510.01378, 2015.
[20] A. van den Oord, N. Kalchbrenner, and K. Kavukcuoglu, “Pixel recurrent neural networks”, arXiv preprint arXiv:1601.06759, 2016.
[21] Z. Yang, Z. Hu, R. Salakhutdinov, and T. Berg-Kirkpatrick, “Improved variational autoencoders for text modeling using dilated convolutions”, arXiv preprint arXiv:1702.08139, 2017.
[22] D. Berthelot, C. Raffel, A. Roy, and I. Goodfellow, “Understanding and improving interpolation in autoencoders via an adversarial regularizer”, arXiv preprint arXiv:1807.07543, 2018. | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/72323 | - |
dc.description.abstract | 非監督學習為現今機器學習中相當熱門的領域。對於自然語言處理方面的問題,在監督式學習法取得相當的成功同時,非監督式方法仍有許多挑戰及困難。本論文旨在提出一基於特徵提取之非監督學習模型,針對文字序列資料集優化其分群結果。實驗結果顯示出保留文字序列資訊與特徵能有效影響並提升分群結果。 | zh_TW |
dc.description.abstract | Unsupervised clustering is a popular research area in machine learning for natural language; while supervised methods such as intent classification achieve remarkably good results on text data, unsupervised methods still face many challenges. A latent-variable-based model is introduced to improve clustering results on NLP tasks. We exploit a recurrent auto-encoder to obtain latent representations of sentences, then iteratively optimize an objective that learns feature representations and cluster assignments simultaneously. Experiments on several datasets show the need for preserving sequence information and the effectiveness of the proposed method. | en |
dc.description.provenance | Made available in DSpace on 2021-06-17T06:35:25Z (GMT). No. of bitstreams: 1 ntu-107-R04943151-1.pdf: 3480838 bytes, checksum: 02d43bf873f64765863dd60fa7a4bc17 (MD5) Previous issue date: 2018 | en |
dc.description.tableofcontents | 口試委員會審定書 (Thesis committee certification) #
誌謝 (Acknowledgements) i
中文摘要 (Chinese abstract) ii
ABSTRACT iii
CONTENTS iv
LIST OF FIGURES vii
LIST OF TABLES viii
Chapter 1 Introduction 1
1.1 Motivation 2
1.2 Contribution of the Thesis 2
1.3 Organization of the Thesis 3
Chapter 2 Background and Related Works 4
2.1 Unsupervised Clustering 4
2.2 Learning Representation of Data 5
2.3 t-Distributed Stochastic Neighbor Embedding (t-SNE) 6
2.4 Kullback-Leibler (KL) Divergence 7
2.5 Deep Auto-encoder 8
2.6 Long Short-Term Memory (LSTM) Network 8
Chapter 3 Deep Embedding Clustering Framework (DEC) 11
3.1 Deep Embedding Clustering 11
3.1.1 Preliminaries and Notations 11
3.1.2 Intuition of Deep Embedding Clustering 11
3.2 Understanding the Objectives of DEC 12
3.2.1 Soft Assignment 13
3.2.2 KL Divergence Minimization 13
3.3 Training Flow of DEC 14
Chapter 4 Optimizing Unsupervised Text Clustering 17
4.1 Optimizing Text Clustering with DEC 17
4.2 LSTM Auto-encoder 18
4.3 Word Embedding Weight Initialization 19
4.4 Dropout and Batch Normalization 20
Chapter 5 Experimental Results 22
5.1 Dataset 22
5.2 Experiment Setup 26
5.2.1 Evaluation Metric 26
5.2.2 Experiment Environment 26
5.2.3 Implementation Details 27
5.3 Experimental Results 27
5.3.1 Experiments on Auto-encoder + k-means 27
5.3.2 Improvement of LSTM with Weight Initialization 28
5.3.3 Experiments on DEC 28
5.3.4 Experiments on Different Dropout Rates 30
5.3.5 Latent Visualization 30
5.3.6 Comparison with Non-recurrent Model 33
5.4 Case Study 34
5.4.1 Bag-of-words (BOW) Model vs. Sequence Model 34
5.4.2 Evaluation of Latent Space 35
Chapter 6 Conclusion and Future Works 37
REFERENCE 38 | |
dc.language.iso | en | |
dc.title | 利用深度嵌入向量模型的非監督式文字分群方法 | zh_TW |
dc.title | Unsupervised Text Clustering with Deep Embedding Model | en |
dc.type | Thesis | |
dc.date.schoolyear | 106-2 | |
dc.description.degree | 碩士 (Master's) | |
dc.contributor.oralexamcommittee | 于天立(Tian-Li Yu),周俊男(Chun-Nan Chou) | |
dc.subject.keyword | 機器學習, 深度學習, 自然語言處理, 分群 | zh_TW |
dc.subject.keyword | machine learning, deep learning, natural language processing, clustering | en |
dc.relation.page | 40 | |
dc.identifier.doi | 10.6342/NTU201803637 | |
dc.rights.note | 有償授權 (paid authorization) | |
dc.date.accepted | 2018-08-16 | |
dc.contributor.author-college | 電機資訊學院 (College of Electrical Engineering and Computer Science) | zh_TW |
dc.contributor.author-dept | 電子工程學研究所 (Graduate Institute of Electronics Engineering) | zh_TW |
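The abstract above outlines a DEC-style procedure: sentence codes from a recurrent auto-encoder are clustered by iteratively matching a soft assignment distribution Q against a sharpened target distribution P. The following is a minimal sketch of that clustering objective, assuming the deep-embedding-clustering formulation of [3]; the toy 2-D codes, `alpha = 1.0`, and the random centroid initialization are illustrative assumptions, not the thesis's implementation.

```python
import numpy as np

def soft_assign(z, centroids, alpha=1.0):
    """Soft assignment q_ij: Student's t similarity between code i and centroid j."""
    d2 = ((z[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)  # (n, k) squared distances
    q = (1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(axis=1, keepdims=True)  # normalize over clusters

def target_distribution(q):
    """Sharpened target p_ij = (q_ij^2 / f_j) / sum_j'(q_ij'^2 / f_j'), with soft frequency f_j."""
    weight = q ** 2 / q.sum(axis=0)
    return weight / weight.sum(axis=1, keepdims=True)

def kl_divergence(p, q):
    """Clustering objective KL(P || Q) that is minimized during training."""
    return float((p * np.log(p / q)).sum())

# Toy demo: 2-D points standing in for latent codes of an LSTM auto-encoder.
rng = np.random.default_rng(0)
z = np.vstack([rng.normal(0.0, 0.3, (50, 2)), rng.normal(3.0, 0.3, (50, 2))])
centroids = z[rng.choice(len(z), size=2, replace=False)]  # the full method uses k-means init
q = soft_assign(z, centroids)
p = target_distribution(q)
print("KL(P || Q) =", kl_divergence(p, q))
```

In the full method, this KL term is driven down by gradient descent over both the centroids and the encoder parameters, with P recomputed at intervals; the thesis pairs the objective with an LSTM auto-encoder so that the latent codes preserve word-order information in the sentences.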
Appears in Collections: 電子工程學研究所 (Graduate Institute of Electronics Engineering)
Files in This Item:
File | Size | Format |
---|---|---|
ntu-107-1.pdf (currently not authorized for public access) | 3.4 MB | Adobe PDF |
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.