Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/52042

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 李宏毅(Hung-Yi Lee) | |
| dc.contributor.author | Shang-Ming Wang | en |
| dc.contributor.author | 王上銘 | zh_TW |
| dc.date.accessioned | 2021-06-15T14:04:52Z | - |
| dc.date.available | 2020-08-24 | |
| dc.date.copyright | 2020-08-24 | |
| dc.date.issued | 2020 | |
| dc.date.submitted | 2020-08-11 | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/52042 | - |
| dc.description.abstract | 大量訊息的傳遞在現今社會中已經成了日常生活的一部份,人們透過網路蒐集、散播各種知識與技術,在此前提下,大量資料的取得變得更加容易,如何利用未標記資料增進模型效能的做法成為這幾年大家關注的焦點。未標記資料預訓練加上標記資料微調已經成了這幾年自然語言處理領域的標準配備,尤其在自然語言生成領域此方法帶來了十足的進步,使得利用自然語言生成的下游任務成為可能。本論文嘗試利用問題生成的方式實現機器閱讀理解的半監督式學習與轉移學習,然而透過模型生成機器閱讀理解的標記資料較分類任務困難許多,原因是分類任務的標記多存在唯一且固定數量的種類,然機器閱讀理解需同時針對文章生成答案與問題,由於問題是自然語言型式因此存在無限多種可能結果,且答案與問題並不是簡單的一對一對應,這些都為閱讀理解資料的生成帶來了相當大的困難。本論文首先探討使用預訓練模型進行問題生成的情況下,不同特徵以及不同問題選擇方式下對模型表現的影響,並發現預訓練模型在問題生成的品質以及問題的可回答性方面都帶來了大幅的進步,使得使用問題生成產生資料成為可能。再來是將問題生成應用於機器閱讀理解的半監督式學習,希望機器可以透過自問自答的方式在大量文章上生成更多的訓練資料,我們比較了兩種常見的答案生成方式以及不同生成資料數量對問答模型表現的影響,並發現此方法確實能讓問答模型的表現更進一步。最後是將此方法用於不同領域的問答集上,希望模型能在一個資料集上學習發問的技巧,當遇到領域相差較大的情況時能透過在該領域自我學習的方式提升模型的表現,結果也顯示了此方法應用於不同資料集上也能帶來一定程度的進步。 | zh_TW |
| dc.description.abstract | The transmission of large amounts of information has become part of daily life in today's society: people collect and disseminate all kinds of knowledge and technology through the Internet, which makes large quantities of data easy to obtain, so how to use unlabeled data to improve model performance has become a focus of attention in recent years. Pre-training on unlabeled data followed by fine-tuning on labeled data has become standard practice in natural language processing, and in natural language generation in particular this approach has brought substantial progress, making it possible to apply generation models to downstream tasks. This thesis uses question generation to realize semi-supervised learning and transfer learning for machine reading comprehension. However, generating labeled data with a model is much harder for machine reading comprehension than for classification tasks: classification labels usually come from a single, fixed set of classes, whereas reading comprehension requires generating both an answer and a question for each article. Moreover, a question is expressed in natural language, so there are infinitely many possible questions with the same meaning, and answers and questions are not in a simple one-to-one correspondence. All of this makes generating reading comprehension data considerably more difficult. We first study how different input features and different question selection methods affect the model when a pre-trained model is used for question generation, and find that pre-training substantially improves both the quality and the answerability of the generated questions, making it feasible to produce data through question generation. Second, we apply question generation to semi-supervised learning for machine reading comprehension, letting the machine create additional training data over a large collection of articles by asking and answering its own questions. We compare two common answer generation methods and different amounts of generated data, and find that this approach further improves the question answering model. Finally, we apply the same data generation method to question answering datasets from other domains, so that the model learns how to ask questions on one dataset and then improves itself on a distant domain through the same self-questioning and self-answering procedure. The results show that this method also brings a certain degree of improvement when applied to different datasets. | en |
| dc.description.provenance | Made available in DSpace on 2021-06-15T14:04:52Z (GMT). No. of bitstreams: 1 U0001-0708202016225500.pdf: 5508775 bytes, checksum: 3b73e89c1f7c89d4ccbe24be4a678099 (MD5) Previous issue date: 2020 | en |
| dc.description.tableofcontents | 致謝 i 中文摘要 iii 一、導論 1 1.1 研究動機 1 1.2 相關研究 3 1.2.1 問題生成 3 1.2.2 問答系統增加訓練資料的方法 4 1.3 研究方向與主要貢獻 4 1.4 章節安排 5 二、背景知識 6 2.1 類神經網路 6 2.1.1 簡介 6 2.1.2 訓練方法 8 2.1.3 遞歸式神經網路(Recurrent Neural Network, RNN) 11 2.1.4 注意力類神經網路(Attention Neural Network, ANN) 15 2.2 序列到序列模型 16 2.2.1 簡介 16 2.2.2 基本型序列到序列模型 18 2.2.3 注意力序列到序列模型 19 2.2.4 轉換器模型 20 2.3 非監督式預訓練 23 2.3.1 詞嵌入(Word Embedding) 24 2.3.2 考慮上下文的詞嵌入(Contextualized Word Embedding) 27 2.3.3 生成任務的預訓練 29 2.4 文字問答系統 30 2.4.1 背景介紹 30 2.4.2 機器閱讀理解 31 2.4.3 答案型式 31 2.4.4 評估方式 32 2.5 自動問題生成 33 2.5.1 背景介紹 33 2.5.2 閱讀理解的問題生成 34 2.5.3 評估方式 36 2.6 本章總結 39 三、將預訓練模型用於問題生成 41 3.1 簡介 41 3.2 模型架構介紹 41 3.2.1 次詞單位 41 3.2.2 網路架構 43 3.3 評估方式 47 3.3.1 句子相似性 47 3.3.2 流暢度 53 3.3.3 可回答性 54 3.4 實驗設計與結果 55 3.4.1 資料介紹 55 3.4.2 訓練方法 56 3.4.3 推論方法 57 3.4.4 實驗一:特徵選擇 58 3.4.5 實驗二:單向GPT2與雙向GPT2之比較 76 3.4.6 實驗三:不同評估方式之相關性 80 3.5 本章總結 82 四、以問題生成實現機器閱讀理解之半監督式學習 83 4.1 簡介 83 4.2 使用人工標記答案 86 4.2.1 資料介紹 86 4.2.2 模型 86 4.2.3 評估方式 87 4.2.4 實驗介紹 87 4.2.5 問答模型、問題語言模型、問題生成模型的準備 88 4.2.6 實驗一:問題選擇方式的比較 89 4.2.7 實驗二:大量生成資料與資料過濾 93 4.2.8 實驗三:問題多樣性 94 4.2.9 結論 102 4.3 使用機器生成資料 103 4.3.1 資料介紹 103 4.3.2 答案生成 104 4.3.3 實驗一:不同答案生成方式的比較 107 4.3.4 實驗二:使用維基百科的文章生成資料 109 4.3.5 結論 117 4.4 本章總結 117 五、以問題生成實現機器閱讀理解之轉移學習 119 5.1 簡介 119 5.2 資料介紹 119 5.3 實驗流程 121 5.4 評估方式 121 5.5 實驗結果 122 5.5.1 文章切割的NewsQA 122 5.5.2 原始文章長度的NewsQA 123 5.6 本章總結 124 六、結論與展望 126 6.1 研究貢獻與討論 126 6.2 未來展望 127 6.2.1 使用更好的預訓練模型 127 6.2.2 問題的生成與選擇 127 6.2.3 可控制的問題生成 128 6.2.4 問答模型與問題生成模型的交互訓練 128 6.2.5 更困難的資料集 129 參考文獻 130 | |
| dc.language.iso | zh-TW | |
| dc.subject | 轉移學習 | zh_TW |
| dc.subject | 問題生成 | zh_TW |
| dc.subject | 半監督式學習 | zh_TW |
| dc.subject | Question Generation | en |
| dc.subject | Semi-Supervised Learning | en |
| dc.subject | Transfer Learning | en |
| dc.title | 以自動問題生成實現機器閱讀理解之半監督式學習與轉移學習 | zh_TW |
| dc.title | Explore the use of Automatic Question Generation in Semi-supervised learning and Transfer learning on Machine Reading Comprehension | en |
| dc.type | Thesis | |
| dc.date.schoolyear | 108-2 | |
| dc.description.degree | 碩士 | |
| dc.contributor.oralexamcommittee | 李琳山(Lin-Shan Lee),鄭秋豫(Chiu-Yu Tseng),王小川(Hsiao-Chuan Wang),陳信宏(Sin-Horng Chen),簡仁宗(Jen-Tzung Chien) | |
| dc.subject.keyword | 問題生成,半監督式學習,轉移學習 | zh_TW |
| dc.subject.keyword | Question Generation, Semi-Supervised Learning, Transfer Learning | en |
| dc.relation.page | 140 | |
| dc.identifier.doi | 10.6342/NTU202002649 | |
| dc.rights.note | 有償授權 | |
| dc.date.accepted | 2020-08-12 | |
| dc.contributor.author-college | 電機資訊學院 | zh_TW |
| dc.contributor.author-dept | 電機工程學研究所 | zh_TW |

Appears in Collections: 電機工程學系
Files in This Item:
| File | Size | Format |
|---|---|---|
| U0001-0708202016225500.pdf (restricted, not publicly available) | 5.38 MB | Adobe PDF |
All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.
