Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/72323
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 黃鐘揚(Chung-Yang Ric Huang) | |
dc.contributor.author | Ko-Ching Liang | en |
dc.contributor.author | 梁可擎 | zh_TW |
dc.date.accessioned | 2021-06-17T06:35:25Z | - |
dc.date.available | 2018-08-21 | |
dc.date.copyright | 2018-08-21 | |
dc.date.issued | 2018 | |
dc.date.submitted | 2018-08-15 | |
dc.identifier.citation | [1] L. van der Maaten and G.E. Hinton, “Visualizing data using t-SNE”, Journal of Machine Learning Research, 2008.
[2] G.E. Hinton and R.R. Salakhutdinov, “Reducing the dimensionality of data with neural networks”, Science, 2006.
[3] J. Xie, R. Girshick, and A. Farhadi, “Unsupervised deep embedding for clustering analysis”, International Conference on Machine Learning, 2016.
[4] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition”, Proceedings of the IEEE, November 1998.
[5] A. Krizhevsky, “Learning multiple layers of features from tiny images”, Master’s thesis, Department of Computer Science, University of Toronto, 2009.
[6] C.A. Sugar and G.M. James, “Finding the number of clusters in a dataset”, Journal of the American Statistical Association, 2003.
[7] G.E. Hinton, “Learning distributed representations of concepts”, Proceedings of the 8th Annual Conference of the Cognitive Science Society, 1986.
[8] Y. Bengio, R. Ducharme, P. Vincent, and C. Jauvin, “A neural probabilistic language model”, Journal of Machine Learning Research, 2003.
[9] Y. Bengio, A. Courville, and P. Vincent, “Representation learning: a review and new perspectives”, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013.
[10] T. Mikolov, K. Chen, G. Corrado, and J. Dean, “Efficient estimation of word representations in vector space”, International Conference on Learning Representations Workshop, 2013.
[11] A. Joulin, E. Grave, P. Bojanowski, M. Douze, H. Jégou, and T. Mikolov, “FastText.zip: compressing text classification models”, arXiv preprint arXiv:1612.03651, 2016.
[12] S.R. Bowman, L. Vilnis, O. Vinyals, A.M. Dai, R. Jozefowicz, and S. Bengio, “Generating sentences from a continuous space”, arXiv preprint arXiv:1511.06349, 2015.
[13] L. van der Maaten, “Learning a parametric embedding by preserving local structure”, International Conference on Artificial Intelligence and Statistics, 2009.
[14] Google, “word2vec”, https://code.google.com/archive/p/word2vec/.
[15] R. Pascanu, T. Mikolov, and Y. Bengio, “On the difficulty of training recurrent neural networks”, International Conference on Machine Learning, 2013.
[16] S. Hochreiter, “Untersuchungen zu dynamischen neuronalen Netzen”, Diploma thesis, Institut f. Informatik, Technische Univ. Munich, 1991.
[17] Y. Bengio, P. Simard, and P. Frasconi, “Learning long-term dependencies with gradient descent is difficult”, IEEE Transactions on Neural Networks, 1994.
[18] W. Zaremba, I. Sutskever, and O. Vinyals, “Recurrent neural network regularization”, arXiv preprint arXiv:1409.2329, September 2014.
[19] C. Laurent, G. Pereyra, P. Brakel, Y. Zhang, and Y. Bengio, “Batch normalized recurrent neural networks”, arXiv preprint arXiv:1510.01378, 2015.
[20] A. van den Oord, N. Kalchbrenner, and K. Kavukcuoglu, “Pixel recurrent neural networks”, arXiv preprint arXiv:1601.06759, 2016.
[21] Z. Yang, Z. Hu, R. Salakhutdinov, and T. Berg-Kirkpatrick, “Improved variational autoencoders for text modeling using dilated convolutions”, arXiv preprint arXiv:1702.08139, 2017.
[22] D. Berthelot, C. Raffel, A. Roy, and I. Goodfellow, “Understanding and improving interpolation in autoencoders via an adversarial regularizer”, arXiv preprint arXiv:1807.07543, 2018. | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/72323 | - |
dc.description.abstract | 非監督學習為現今機器學習中相當熱門的領域。對於自然語言處理方面的問題,在監督式學習法取得相當的成功同時,非監督式方法仍有許多挑戰及困難。本論文旨在提出一基於特徵提取之非監督學習模型,針對文字序列資料集優化其分群結果。實驗結果顯示出保留文字序列資訊與特徵能有效影響並提升分群結果。 | zh_TW |
dc.description.abstract | Unsupervised clustering is a popular research area in machine learning for natural language; while supervised methods such as intent classification achieve remarkably good results on text data, unsupervised methods still face many challenges. A latent-variable-based model is introduced to improve clustering results on NLP tasks. We exploit a recurrent auto-encoder to obtain latent representations of sentences, then iteratively optimize an objective that learns feature representations and cluster assignments simultaneously. Experiments on several datasets show the need for preserving sequence information and the effectiveness of the proposed method. | en |
dc.description.provenance | Made available in DSpace on 2021-06-17T06:35:25Z (GMT). No. of bitstreams: 1 ntu-107-R04943151-1.pdf: 3480838 bytes, checksum: 02d43bf873f64765863dd60fa7a4bc17 (MD5) Previous issue date: 2018 | en |
dc.description.tableofcontents | 口試委員會審定書 (Thesis committee certification) #
誌謝 (Acknowledgements) i
中文摘要 (Chinese abstract) ii
ABSTRACT iii
CONTENTS iv
LIST OF FIGURES vii
LIST OF TABLES viii
Chapter 1 Introduction 1
1.1 Motivation 2
1.2 Contribution of the Thesis 2
1.3 Organization of the Thesis 3
Chapter 2 Background and Related Works 4
2.1 Unsupervised Clustering 4
2.2 Learning Representation of Data 5
2.3 t-Distributed Stochastic Neighbor Embedding (t-SNE) 6
2.4 Kullback-Leibler (KL) Divergence 7
2.5 Deep Auto-encoder 8
2.6 Long Short-Term Memory (LSTM) Network 8
Chapter 3 Deep Embedding Clustering Framework (DEC) 11
3.1 Deep Embedding Clustering 11
3.1.1 Preliminaries and Notations 11
3.1.2 Intuition of Deep Embedding Clustering 11
3.2 Understanding the Objectives of DEC 12
3.2.1 Soft Assignment 13
3.2.2 KL Divergence Minimization 13
3.3 Training Flow of DEC 14
Chapter 4 Optimizing Unsupervised Text Clustering 17
4.1 Optimizing Text Clustering with DEC 17
4.2 LSTM Auto-encoder 18
4.3 Word Embedding Weight Initialization 19
4.4 Dropout and Batch Normalization 20
Chapter 5 Experimental Results 22
5.1 Dataset 22
5.2 Experiment Setup 26
5.2.1 Evaluation Metric 26
5.2.2 Experiment Environment 26
5.2.3 Implementation Details 27
5.3 Experimental Results 27
5.3.1 Experiments on Auto-encoder + k-means 27
5.3.2 Improvement of LSTM with Weight Initialization 28
5.3.3 Experiments on DEC 28
5.3.4 Experiments on Different Dropout Rates 30
5.3.5 Latent Visualization 30
5.3.6 Comparison with Non-recurrent Model 33
5.4 Case Study 34
5.4.1 Bag-of-words (BOW) Model vs. Sequence Model 34
5.4.2 Evaluation of Latent Space 35
Chapter 6 Conclusion and Future Works 37
REFERENCE 38 | |
dc.language.iso | en | |
dc.title | 利用深度嵌入向量模型的非監督式文字分群方法 | zh_TW |
dc.title | Unsupervised Text Clustering with Deep Embedding Model | en |
dc.type | Thesis | |
dc.date.schoolyear | 106-2 | |
dc.description.degree | 碩士 (Master's) | |
dc.contributor.oralexamcommittee | 于天立(Tian-Li Yu),周俊男(Chun-Nan Chou) | |
dc.subject.keyword | 機器學習, 深度學習, 自然語言處理, 分群 | zh_TW |
dc.subject.keyword | machine learning, deep learning, natural language processing, clustering | en |
dc.relation.page | 40 | |
dc.identifier.doi | 10.6342/NTU201803637 | |
dc.rights.note | 有償授權 (paid authorization) | |
dc.date.accepted | 2018-08-16 | |
dc.contributor.author-college | 電機資訊學院 (College of Electrical Engineering and Computer Science) | zh_TW |
dc.contributor.author-dept | 電子工程學研究所 (Graduate Institute of Electronics Engineering) | zh_TW |
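The abstract above outlines a DEC-style procedure: sentence codes from a recurrent auto-encoder are clustered by iteratively matching a soft assignment distribution Q against a sharpened target distribution P. The following is a minimal sketch of that clustering objective, assuming the deep-embedding-clustering formulation of [3]; the toy 2-D codes, `alpha = 1.0`, and the random centroid initialization are illustrative assumptions, not the thesis's implementation.

```python
import numpy as np

def soft_assign(z, centroids, alpha=1.0):
    """Soft assignment q_ij: Student's t similarity between code i and centroid j."""
    d2 = ((z[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)  # (n, k) squared distances
    q = (1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(axis=1, keepdims=True)  # normalize over clusters

def target_distribution(q):
    """Sharpened target p_ij = (q_ij^2 / f_j) / sum_j'(q_ij'^2 / f_j'), with soft frequency f_j."""
    weight = q ** 2 / q.sum(axis=0)
    return weight / weight.sum(axis=1, keepdims=True)

def kl_divergence(p, q):
    """Clustering objective KL(P || Q) that is minimized during training."""
    return float((p * np.log(p / q)).sum())

# Toy demo: 2-D points standing in for latent codes of an LSTM auto-encoder.
rng = np.random.default_rng(0)
z = np.vstack([rng.normal(0.0, 0.3, (50, 2)), rng.normal(3.0, 0.3, (50, 2))])
centroids = z[rng.choice(len(z), size=2, replace=False)]  # the full method uses k-means init
q = soft_assign(z, centroids)
p = target_distribution(q)
print("KL(P || Q) =", kl_divergence(p, q))
```

In the full method, this KL term is driven down by gradient descent over both the centroids and the encoder parameters, with P recomputed at intervals; the thesis pairs the objective with an LSTM auto-encoder so that the latent codes preserve word-order information in the sentences.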
Appears in Collections: 電子工程學研究所 (Graduate Institute of Electronics Engineering)
Files in This Item:
File | Size | Format |
---|---|---|
ntu-107-1.pdf (currently not authorized for public access) | 3.4 MB | Adobe PDF |
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.