Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/81658

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 莊裕澤(Yuh-Jzer Joung) | |
| dc.contributor.author | Hsien-Ting Huang | en |
| dc.contributor.author | 黃獻霆 | zh_TW |
| dc.date.accessioned | 2022-11-24T09:25:23Z | - |
| dc.date.available | 2022-11-24T09:25:23Z | - |
| dc.date.copyright | 2021-08-06 | |
| dc.date.issued | 2021 | |
| dc.date.submitted | 2021-07-22 | |
| dc.identifier.citation | Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS'17). Curran Associates Inc., Red Hook, NY, USA, 6000–6010. Changchang Zeng, Shaobo Li, Qin Li, Jie Hu, Jianjun Hu: “A Survey on Machine Reading Comprehension: Tasks, Evaluation Metrics and Benchmark Datasets”, 2020; [http://arxiv.org/abs/2006.11880 arXiv:2006.11880]. Chih Chieh Shao, Trois Liu, Yuting Lai, Yiying Tseng, Sam Tsai: “DRCD: a Chinese Machine Reading Comprehension Dataset”, 2018; [http://arxiv.org/abs/1806.00920 arXiv:1806.00920]. Chujie Zheng, Minlie Huang, Aixin Sun: “ChID: A Large-scale Chinese IDiom Dataset for Cloze Test”, 2019; [http://arxiv.org/abs/1906.01265 arXiv:1906.01265]. Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio: “Neural Machine Translation by Jointly Learning to Align and Translate”, 2014; [http://arxiv.org/abs/1409.0473 arXiv:1409.0473]. Guokun Lai, Qizhe Xie, Hanxiao Liu, Yiming Yang, Eduard Hovy: “RACE: Large-scale ReAding Comprehension Dataset From Examinations”, 2017; Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing; Association for Computational Linguistics: Copenhagen, Denmark, 2017; pp. 785–794. doi:10.18653/v1/D17-1082. [http://arxiv.org/abs/1704.04683 arXiv:1704.04683]. Ilya Sutskever, Oriol Vinyals, Quoc V. Le. 2014. Sequence to Sequence Learning with Neural Networks. In NIPS 2014, pp. 3104–3112. Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Association for Computational Linguistics, Minneapolis, Minnesota, 4171–4186. Kai Sun, Dian Yu, Dong Yu, and Claire Cardie. 2020. Investigating Prior Knowledge for Challenging Chinese Machine Reading Comprehension. Transactions of the Association for Computational Linguistics 8 (2020), 141–155. Kai Sun, Dian Yu, Jianshu Chen, Dong Yu, Yejin Choi, Claire Cardie: “DREAM: A Challenge Dataset and Models for Dialogue-Based Reading Comprehension”, 2019; [http://arxiv.org/abs/1902.00164 arXiv:1902.00164]. Liang Xu, Hai Hu, Xuanwei Zhang, Lu Li, Chenjie Cao, Yudong Li, Yechen Xu, Kai Sun, Dian Yu, Cong Yu, Yin Tian, Qianqian Dong, Weitang Liu, Bo Shi, Yiming Cui, Junyi Li, Jun Zeng, Rongzhao Wang, Weijian Xie, Yanting Li, Yina Patterson, Zuoyu Tian, Yiwen Zhang, He Zhou, Shaoweihua Liu, Zhe Zhao, Qipeng Zhao, Cong Yue, Xinrui Zhang, Zhengliang Yang, Kyle Richardson, Zhenzhong Lan: “CLUE: A Chinese Language Understanding Evaluation Benchmark”, 2020; [http://arxiv.org/abs/2004.05986 arXiv:2004.05986]. Lin Pan, Rishav Chakravarti, Anthony Ferritto, Michael Glass, Alfio Gliozzo, Salim Roukos, Radu Florian, Avirup Sil: “Frustratingly Easy Natural Question Answering”, 2019; [http://arxiv.org/abs/1909.05286 arXiv:1909.05286]. Martin Sundermeyer, Ralf Schlüter, and Hermann Ney. LSTM neural networks for language modeling. In INTERSPEECH, pages 194–197, 2012. Mohammad Shoeybi, Mostofa Patwary, Raul Puri, Patrick LeGresley, Jared Casper, Bryan Catanzaro: “Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism”, 2019; [http://arxiv.org/abs/1909.08053 arXiv:1909.08053]. Pranav Rajpurkar, Jian Zhang, Konstantin Lopyrev, and Percy Liang. 2016. SQuAD: 100,000+ Questions for Machine Comprehension of Text. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 2383–2392. Pranav Rajpurkar, Robin Jia, and Percy Liang. 2018. Know What You Don’t Know: Unanswerable Questions for SQuAD. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). Association for Computational Linguistics, Melbourne, Australia, 784–789. Qizhe Xie, Guokun Lai, Zihang Dai, Eduard Hovy: “Large-scale Cloze Test Dataset Created by Teachers”, 2017; [http://arxiv.org/abs/1711.03225 arXiv:1711.03225]. Changhao Shan, Junbo Zhang, Yujun Wang, and Lei Xie. “Attention-Based End-to-End Speech Recognition on Voice Search.” 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018. doi:10.1109/icassp.2018.8462492. Tianqi Chen, Carlos Guestrin: “XGBoost: A Scalable Tree Boosting System”, 2016; [http://arxiv.org/abs/1603.02754 arXiv:1603.02754]. DOI: [https://dx.doi.org/10.1145/2939672.2939785 10.1145/2939672.2939785]. Tomáš Mikolov, Kai Chen, Greg Corrado, Jeffrey Dean: “Efficient Estimation of Word Representations in Vector Space”, 2013; [http://arxiv.org/abs/1301.3781 arXiv:1301.3781]. Tomáš Mikolov, Martin Karafiát, Lukáš Burget, Jan Černocký, and Sanjeev Khudanpur. “Recurrent neural network based language model.” In Proc. INTERSPEECH, 2010. Wu Cheng Xuan, “Multilingual Machine Reading Comprehension based on BERT Model”, 2019; Master's thesis, Dept. of Computer Science and Information Engineering, National Taipei University of Technology. Retrieved from https://hdl.handle.net/11296/aua9d5. Yi-Chih Huang, Yu-Lun Hsieh, Yi-Yu Lin, Teo Lin Hui, Hung-Ying Chu, Wen-Lian Hsu: “FLUD: Expert-curated Large-scale Machine Comprehension Dataset with Advanced Reasoning Strategies”, 2021. Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Shijin Wang, Guoping Hu: “Revisiting Pre-Trained Models for Chinese Natural Language Processing”, 2020; [http://arxiv.org/abs/2004.13922 arXiv:2004.13922]. DOI: [https://dx.doi.org/10.18653/v1/2020.findings-emnlp.58 10.18653/v1/2020.findings-emnlp.58]. Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Ziqing Yang, Shijin Wang, Guoping Hu: “Pre-Training with Whole Word Masking for Chinese BERT”, 2019; [http://arxiv.org/abs/1906.08101 arXiv:1906.08101]. Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, Veselin Stoyanov: “RoBERTa: A Robustly Optimized BERT Pretraining Approach”, 2019; [http://arxiv.org/abs/1907.11692 arXiv:1907.11692]. Yoshua Bengio, Réjean Ducharme, Pascal Vincent, and Christian Jauvin. A neural probabilistic language model. Journal of Machine Learning Research (JMLR), 3:1137–1155, 2003. Yu Sun, Shuohuan Wang, Yukun Li, Shikun Feng, Xuyi Chen, Han Zhang, Xin Tian, Danxiang Zhu, Hao Tian, Hua Wu: “ERNIE: Enhanced Representation through Knowledge Integration”, 2019; [http://arxiv.org/abs/1904.09223 arXiv:1904.09223]. Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, Radu Soricut: “ALBERT: A Lite BERT for Self-supervised Learning of Language Representations”. CoRR abs/1909.11942 (2019). Zhilin Yang, Zihang Dai, Yiming Yang, Jaime G. Carbonell, Ruslan Salakhutdinov, and Quoc V. Le. “XLNet: Generalized Autoregressive Pretraining for Language Understanding.” CoRR abs/1906.08237 (2019). | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/81658 | - |
| dc.description.abstract | The dataset used in our experiments is the Formosa Language Understanding Dataset (FLUD), provided by the Science and Technology Policy Research and Information Center (STPI) of the National Applied Research Laboratories. Prior work on FLUD includes a study of multilingual machine reading comprehension based on the BERT model (Wu, 2019) and the Formosa Grand Challenge organized by STPI and the Ministry of Science and Technology. The task we address is multiple-choice reading comprehension in Traditional Chinese. Although datasets and research for Traditional Chinese remain scarce compared with languages such as Simplified Chinese and English, we can convert Traditional Chinese into Simplified Chinese and leverage Simplified Chinese pretrained models: fine-tuning BERT-wwm-ext-base and RoBERTa-wwm-ext-base on the downstream machine reading comprehension task already surpasses previous experimental results. We further propose training with auxiliary Simplified Chinese datasets and combining several trained models through ensemble learning. The auxiliary datasets improve model performance substantially, and ensemble learning adds a further small gain over the individual models' predictions, which makes it a useful technique in close competitions (a minimal scoring-and-ensembling sketch follows the metadata table below). Finally, we reproduce the rules of the Formosa Grand Challenge: the final-round audio files are converted to text with speech-to-text, and our trained models make predictions on the transcripts, slightly surpassing the models of previous studies. The results show that semantic understanding and speech-to-text quality are the two largest obstacles in this setting, so we suggest that future work on related machine reading comprehension tasks focus on these two factors. | zh_TW |
| dc.description.provenance | Made available in DSpace on 2022-11-24T09:25:23Z (GMT). No. of bitstreams: 1 U0001-2107202108263000.pdf: 1826364 bytes, checksum: 7bad76ab991fb7b1c68ced00765e1600 (MD5) Previous issue date: 2021 | en |
| dc.description.tableofcontents | Thesis Committee Certification i Acknowledgements ii Abstract (Chinese) iii Abstract iv List of Figures vii List of Tables viii Chapter 1 Introduction 1 Chapter 2 Related Work 4 2.1 Language Model 4 2.2 Machine Reading Comprehension 5 2.2.1 MRC task 5 2.2.2 MRC Multiple Choice Datasets 6 2.3 Mandarin Pretrained Model 7 Chapter 3 Methodology 9 3.1 Problem Definition 9 3.2 Pretrained Models 9 3.2.1 Pretrained model selection 9 3.2.2 Existing pretrained models 10 3.3 Dataset 11 3.4 Proposed Methodology 13 3.4.1 Learning with Auxiliary Dataset 13 3.4.2 Ensemble Model 14 Chapter 4 Experiment 16 4.1 Datasets and Evaluation 16 4.2 Compared Pretrained Models 17 4.3 Experiment Settings 18 4.4 Experiment Results 20 4.4.1 Experiment on the Formosa Language Understanding Dataset (FLUD) 20 4.4.2 Learning with auxiliary dataset 22 4.4.3 Ensemble learning with different trained models 24 4.5 Experiment Results on Formosa Grand Challenge 27 4.5.1 Exploratory Data Analysis of the Datasets on Formosa Grand Challenge 27 4.5.2 Learning with auxiliary dataset on Formosa Grand Challenge 29 4.5.3 Ensemble learning with different trained models on Formosa Grand Challenge 31 4.5.4 Analysis on experiment results of Formosa Grand Challenge 37 Chapter 5 Conclusion 42 References 45 | |
| dc.language.iso | en | |
| dc.subject | 中文語音轉文字 | zh_TW |
| dc.subject | 機器閱讀理解 | zh_TW |
| dc.subject | 自然語言處理 | zh_TW |
| dc.subject | 深度學習 | zh_TW |
| dc.subject | 輔助資料集 | zh_TW |
| dc.subject | 集成學習 | zh_TW |
| dc.subject | Natural Language Processing | en |
| dc.subject | Speech-to-Text | en |
| dc.subject | Ensemble Learning | en |
| dc.subject | Auxiliary Dataset | en |
| dc.subject | Deep Learning | en |
| dc.subject | Machine Reading Comprehension | en |
| dc.title | 應用RoBERTa-wwm預訓練模型與集成學習以增強機器閱讀理解之表現 | zh_TW |
| dc.title | Machine Question Answering Based on RoBERTa-wwm Pre-trained Model and Ensemble Learning | en |
| dc.date.schoolyear | 109-2 | |
| dc.description.degree | Master's | |
| dc.contributor.oralexamcommittee | 陳建錦(Chien-Chin Chen),盧信銘(Hsin-Min Lu),王新民 | |
| dc.subject.keyword | 機器閱讀理解,自然語言處理,深度學習,輔助資料集,集成學習,中文語音轉文字 | zh_TW |
| dc.subject.keyword | Machine Reading Comprehension, Natural Language Processing, Deep Learning, Auxiliary Dataset, Ensemble Learning, Speech-to-Text | en |
| dc.relation.page | 48 | |
| dc.identifier.doi | 10.6342/NTU202101619 | |
| dc.rights.note | Not authorized | |
| dc.date.accepted | 2021-07-22 | |
| dc.contributor.author-college | 管理學院 | zh_TW |
| dc.contributor.author-dept | 資訊管理學研究所 | zh_TW |
| Appears in Collections: | Department of Information Management | |
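
The abstract above describes fine-tuning Chinese whole-word-masking pretrained models on multiple-choice reading comprehension and then ensembling several trained models. As a rough illustration only, and not the thesis's actual code, the minimal sketch below shows how such scoring and ensembling might look with the Hugging Face transformers library. The checkpoint names `hfl/chinese-roberta-wwm-ext` and `hfl/chinese-bert-wwm-ext`, the helper names, and the passage/question/option pairing are assumptions; in practice each checkpoint would first be fine-tuned on FLUD (plus the auxiliary Simplified Chinese data), otherwise its multiple-choice head is randomly initialized.

```python
# Minimal sketch, assuming PyTorch and Hugging Face transformers are installed.
# NOT the thesis implementation: checkpoints and helpers here are illustrative.
import torch
from transformers import AutoTokenizer, AutoModelForMultipleChoice

# Assumed public whole-word-masking checkpoints; ideally already fine-tuned
# on the multiple-choice MRC task before being used for prediction.
CHECKPOINTS = [
    "hfl/chinese-roberta-wwm-ext",
    "hfl/chinese-bert-wwm-ext",
]

def choice_probs(model_name: str, passage: str, question: str, options: list[str]) -> torch.Tensor:
    """Return one model's probability for each answer option of a single question."""
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForMultipleChoice.from_pretrained(model_name)
    model.eval()
    # Pair the same passage with "question + option" for every candidate answer.
    first = [passage] * len(options)
    second = [f"{question} {opt}" for opt in options]
    enc = tokenizer(first, second, truncation=True, padding=True,
                    max_length=512, return_tensors="pt")
    # The multiple-choice head expects tensors shaped (batch, num_choices, seq_len).
    enc = {k: v.unsqueeze(0) for k, v in enc.items()}
    with torch.no_grad():
        logits = model(**enc).logits              # shape: (1, num_choices)
    return torch.softmax(logits, dim=-1).squeeze(0)

def ensemble_answer(passage: str, question: str, options: list[str]) -> int:
    """Average per-option probabilities over all checkpoints and pick the best option."""
    probs = torch.stack([choice_probs(name, passage, question, options)
                         for name in CHECKPOINTS])
    return int(probs.mean(dim=0).argmax())

# Hypothetical usage:
# best = ensemble_answer(passage_text, "問題?", ["選項A", "選項B", "選項C", "選項D"])
```

Averaging softmax probabilities (or logits, or taking a majority vote) over differently trained checkpoints is one straightforward way to realize the ensemble step that the abstract credits with a small additional gain over any single model.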
Files in This Item:
| File | Size | Format |
|---|---|---|
| U0001-2107202108263000.pdf (not authorized for public access) | 1.78 MB | Adobe PDF |
All items in the system are protected by copyright, with all rights reserved, unless otherwise indicated.
