Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/629
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 李琳山(Lin-shan Lee) | |
dc.contributor.author | Yau-Shian Wang | en |
dc.contributor.author | 王耀賢 | zh_TW |
dc.date.accessioned | 2021-05-11T04:50:18Z | - |
dc.date.available | 2020-02-13 | |
dc.date.available | 2021-05-11T04:50:18Z | - |
dc.date.copyright | 2020-02-13 | |
dc.date.issued | 2019 | |
dc.date.submitted | 2020-02-10 | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/handle/123456789/629 | - |
dc.description.abstract | With the rise of the Internet, people leave all kinds of data online. Since most of this data is unannotated, unsupervised learning, which trains on unlabeled data, has become an important research topic in recent years. In this thesis, we use generative adversarial networks to explore the possibilities of unsupervised learning in natural language processing, focusing on two topics.
The first topic is unpaired abstractive text summarization, i.e., training a machine to write abstractive summaries of documents without parallel pairs of training documents and their human-written summaries. Here we use the summary as the latent representation of a document autoencoder, and use a generative adversarial network to constrain this latent representation to be human-readable: providing the discriminator with a relatively small number of human-written summaries of unrelated documents is enough for the machine to learn how humans write summaries. We evaluate the proposed model on English and Chinese news summarization datasets, and its performance verifies the feasibility of this approach. The second topic is unsupervised topic modeling, in which the machine should automatically discover document topics close to human cognition. We use an information-maximizing generative adversarial network to model document generation as driven by a discrete topic distribution plus a continuous vector controlling variation within each topic, rather than, as in previous topic models, by a mixture of many minor sub-topics. Experiments show that our model improves significantly over previous work both on document classification and on the quality of the keywords extracted for each topic. | zh_TW |
dc.description.abstract | With the development of the Internet, people have put vast amounts of data online. Since most of this data is unannotated, how to efficiently utilize unlabeled data for unsupervised learning has become an important research direction. In this thesis, we use Generative Adversarial Networks (GANs) to explore the possibilities of unsupervised learning in NLP, covering two different topics.
The first topic is unsupervised abstractive text summarization, that is, text summarization without any paired data. We use summaries as latent representations of an autoencoder and use a GAN to constrain the latent representation to be human-readable. With a small number of unpaired summaries as examples for the discriminator, the machine can learn how humans write summaries for documents. Results on English and Chinese news datasets demonstrate the effectiveness of our model. The second topic is unsupervised topic modeling, whose goal is to train a machine that can automatically discover latent topics close to human cognition. Unlike prior topic models, which model text as generated from a mixture of sub-topics, we use InfoGAN to model text as generated from a discrete code controlling high-level topics and a continuous vector controlling variation within each topic. Compared to prior work, our proposed method greatly improves performance on unsupervised classification and topical word extraction. | en |
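As a concrete illustration of the first approach, below is a minimal sketch (not the thesis implementation) of the summarization-as-latent-representation idea in PyTorch: a generator G compresses a document into a short token sequence, a reconstructor R rebuilds the document from that sequence, and a discriminator D, shown unpaired human-written summaries, pushes the latent sequence toward human-readable text. All module sizes, the GRU architecture, and the greedy decoding are illustrative assumptions.

```python
import torch
import torch.nn as nn

VOCAB, EMB, HID, SUM_LEN, DOC_LEN = 10000, 128, 256, 12, 40

class Seq2Seq(nn.Module):
    """One-layer GRU encoder-decoder emitting a fixed-length sequence of logits."""
    def __init__(self, out_len):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB)
        self.enc = nn.GRU(EMB, HID, batch_first=True)
        self.dec = nn.GRU(EMB, HID, batch_first=True)
        self.out = nn.Linear(HID, VOCAB)
        self.out_len = out_len

    def forward(self, tokens):
        _, h = self.enc(self.emb(tokens))      # final hidden state summarizes input
        inp = torch.zeros_like(tokens[:, :1])  # <bos> token assumed to be id 0
        logits = []
        for _ in range(self.out_len):          # unrolled greedy decoding
            o, h = self.dec(self.emb(inp), h)
            step = self.out(o)                 # (B, 1, VOCAB)
            logits.append(step)
            inp = step.argmax(-1)              # feed predicted token back in
        return torch.cat(logits, dim=1)        # (B, out_len, VOCAB)

G = Seq2Seq(out_len=SUM_LEN)  # generator: document -> summary logits (the latent)
R = Seq2Seq(out_len=DOC_LEN)  # reconstructor: summary -> document logits
D = nn.Sequential(nn.Linear(VOCAB, HID), nn.ReLU(), nn.Linear(HID, 1))  # per-step critic

doc = torch.randint(1, VOCAB, (4, DOC_LEN))    # toy batch of document token ids
summary_logits = G(doc)
soft_summary = summary_logits.softmax(-1)      # soft word distributions keep gradients

# Reconstruction loss: the summary must keep enough content to rebuild the document.
# (argmax breaks the gradient path here; the actual system needs soft embeddings
# or policy-gradient training for this step.)
recon_logits = R(soft_summary.argmax(-1))
recon_loss = nn.functional.cross_entropy(
    recon_logits.reshape(-1, VOCAB), doc.reshape(-1))

# Adversarial loss: D should score unpaired human-written summaries (one-hot
# encoded) higher than the generated soft summaries.
real = nn.functional.one_hot(torch.randint(1, VOCAB, (4, SUM_LEN)), VOCAB).float()
d_loss = -(torch.log(torch.sigmoid(D(real)).mean()) +
           torch.log(1 - torch.sigmoid(D(soft_summary)).mean()))
g_loss = recon_loss - torch.log(torch.sigmoid(D(soft_summary)).mean())
print(float(recon_loss), float(d_loss), float(g_loss))
```

Since the discriminator only ever sees unpaired summaries, no document-summary pairs are needed; how the non-differentiability of discrete words is handled (soft distributions, policy gradient, etc.) is the main design choice the thesis addresses.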
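For the second topic, here is a similarly hedged sketch of an InfoGAN-style topic model: the generator maps a discrete topic code c plus a continuous vector z to a bag-of-words vector, and an auxiliary head Q must recover c from the generated output, tying each discrete code to a recognizable topic. Vocabulary size, topic count, and network shapes are assumptions.

```python
import torch
import torch.nn as nn

VOCAB, N_TOPICS, Z_DIM, HID, B = 2000, 20, 64, 256, 8

# Generator: [one-hot topic code c ; continuous z] -> bag-of-words distribution.
G = nn.Sequential(nn.Linear(N_TOPICS + Z_DIM, HID), nn.ReLU(),
                  nn.Linear(HID, VOCAB), nn.Softmax(dim=-1))
# Shared discriminator body with two heads, as in InfoGAN.
body = nn.Sequential(nn.Linear(VOCAB, HID), nn.ReLU())
D_head = nn.Linear(HID, 1)          # real/fake score
Q_head = nn.Linear(HID, N_TOPICS)   # recovers the discrete topic code

c = torch.randint(0, N_TOPICS, (B,))   # sampled topic codes
z = torch.randn(B, Z_DIM)              # within-topic variation
fake_bow = G(torch.cat([nn.functional.one_hot(c, N_TOPICS).float(), z], dim=-1))

real_bow = torch.rand(B, VOCAB)        # stand-in for real bag-of-words vectors
real_bow = real_bow / real_bow.sum(-1, keepdim=True)

d_real = torch.sigmoid(D_head(body(real_bow)))
d_fake = torch.sigmoid(D_head(body(fake_bow)))
d_loss = -(torch.log(d_real) + torch.log(1 - d_fake)).mean()

# Mutual-information term: Q must classify which topic code produced each sample,
# forcing each discrete code to correspond to a distinct, recognizable topic.
mi_loss = nn.functional.cross_entropy(Q_head(body(fake_bow)), c)
g_loss = -torch.log(d_fake).mean() + mi_loss
print(float(d_loss), float(g_loss))
```

After training, the highest-weighted words in G's output for a fixed code c can be read off as that topic's keywords, which is how topical word extraction would be evaluated in such a setup.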
dc.description.provenance | Made available in DSpace on 2021-05-11T04:50:18Z (GMT). No. of bitstreams: 1 ntu-108-R06944019-1.pdf: 2995290 bytes, checksum: 1b0ee5803716f55fa4dfcb98173de28a (MD5) Previous issue date: 2019 | en |
dc.description.tableofcontents | Committee Certification . . . i
Chinese Abstract . . . ii
Chapter 1: Introduction . . . 1
1.1 Research Motivation . . . 1
1.2 Research Directions . . . 5
1.3 Thesis Organization . . . 5
Chapter 2: Background . . . 7
2.1 Deep Neural Networks . . . 7
2.1.1 Model Architecture . . . 7
2.1.2 Training Neural Networks . . . 8
2.2 Recurrent Neural Networks . . . 10
2.2.1 Basic Model . . . 10
2.2.2 Long Short-Term Memory (LSTM) . . . 10
2.2.3 Sequence-to-Sequence Networks . . . 11
2.2.4 Attention-Based Sequence-to-Sequence Networks . . . 11
2.2.5 Hybrid Pointer Networks . . . 12
2.3 Autoencoders . . . 13
2.3.1 Basic Principles . . . 13
2.3.2 Variational Autoencoders (VAE) . . . 14
2.4 Generative Adversarial Networks (GAN) . . . 15
2.4.1 Basic Principles . . . 15
2.4.2 Generating Language with GANs . . . 16
2.4.3 Information-Maximizing GANs (InfoGAN) . . . 20
2.5 Chapter Summary . . . 20
Chapter 3: Unsupervised Abstractive Text Summarization with GANs . . . 21
3.1 Task Overview . . . 21
3.2 Training Method . . . 22
3.2.1 Method Overview . . . 22
3.2.2 Minimizing Reconstruction Loss . . . 24
3.2.3 GAN Training . . . 25
3.3 Experiments . . . 29
3.3.1 Datasets . . . 29
3.3.2 Implementation Details . . . 30
3.3.3 Evaluation Methods . . . 31
3.3.4 Unpaired Summarization on English Gigaword . . . 32
3.3.5 Semi-Supervised Summarization on English Gigaword . . . 34
3.3.6 Transfer Learning on English Gigaword . . . 35
3.3.7 Unpaired Summarization on CNN/Daily Mail . . . 36
3.3.8 Unpaired Summarization on Chinese Gigaword . . . 37
3.4 Chapter Summary . . . 38
Chapter 4: Topic Modeling with InfoGAN . . . 42
4.1 Task Overview . . . 42
4.2 Training Method . . . 43
4.2.1 Method Overview . . . 43
4.2.2 Model Description . . . 44
4.2.3 Model Training . . . 46
4.2.4 Generating Documents from Generated Bag-of-Words Vectors . . . 48
4.3 Experiments . . . 48
4.3.1 Datasets . . . 48
4.3.2 Model Architecture and Hyperparameters . . . 50
4.3.3 Unsupervised Document Classification . . . 50
4.3.4 Topic Coherence . . . 52
4.3.5 Ablation Study . . . 56
4.3.6 Disentangled Representations . . . 58
4.4 Chapter Summary . . . 59
Chapter 5: Conclusion and Future Work . . . 65
5.1 Conclusion and Main Contributions . . . 65
5.2 Future Work . . . 67
5.2.1 A Two-Stage Approach to Unpaired Summarization . . . 67
5.2.2 Topic Modeling with Recurrent Neural Networks . . . 67
References . . . 68 | |
dc.language.iso | zh-TW | |
dc.title | 以生成對抗網路達成非監督式文章摘要及主題模型 | zh_TW |
dc.title | Unsupervised Text Summarization and Topic Modeling using Generative Adversarial Networks | en |
dc.date.schoolyear | 108-1 | |
dc.description.degree | Master | |
dc.contributor.oralexamcommittee | 陳信宏,鄭秋豫,王小川,李宏毅 | |
dc.subject.keyword | 非監督式學習, 文章摘要, 主題模型, 生成對抗網路 | zh_TW |
dc.subject.keyword | Unsupervised learning, Text summarization, Topic model, GAN | en |
dc.relation.page | 72 | |
dc.identifier.doi | 10.6342/NTU202000333 | |
dc.rights.note | Authorized (open access worldwide) | |
dc.date.accepted | 2020-02-10 | |
dc.contributor.author-college | College of Electrical Engineering and Computer Science | zh_TW |
dc.contributor.author-dept | Graduate Institute of Networking and Multimedia | zh_TW |
Appears in Collections: | Graduate Institute of Networking and Multimedia
Files in This Item:
File | Size | Format | |
---|---|---|---|
ntu-108-1.pdf | 2.93 MB | Adobe PDF | View/Open |
All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated by their specific license terms.