Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/49054

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 魏志平(Chih-Ping Wei) | |
| dc.contributor.author | Chen-Yu Chien | en |
| dc.contributor.author | 簡辰宇 | zh_TW |
| dc.date.accessioned | 2021-06-15T11:14:45Z | - |
| dc.date.available | 2020-08-24 | |
| dc.date.copyright | 2020-08-24 | |
| dc.date.issued | 2020 | |
| dc.date.submitted | 2020-08-14 | |
| dc.identifier.citation | Arjovsky, M., Chintala, S., and Bottou, L. (2017). Wasserstein GAN. arXiv preprint arXiv:1701.07875. Bajaj, P., Campos, D., Craswell, N., Deng, L., Gao, J., Liu, X., Majumder, R., McNamara, A., Mitra, B., Nguyen, T., et al. (2016). MS MARCO: a human generated machine reading comprehension dataset. arXiv preprint arXiv:1611.09268. Beltagy, I., Lo, K., and Cohan, A. (2019). SciBERT: a pretrained language model for scientific text. arXiv preprint arXiv:1903.10676. Chung, Y. A., Lee, H. Y., and Glass, J. (2017). Supervised and unsupervised transfer learning for question answering. arXiv preprint arXiv:1711.05345. Conneau, A. and Lample, G. (2019). Cross-lingual language model pretraining. In Advances in Neural Information Processing Systems, pages 7059–7069. Cui, Y., Che, W., Liu, T., Qin, B., Wang, S., and Hu, G. (2019a). Cross-lingual machine reading comprehension. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 1586–1595. Cui, Y., Che, W., Liu, T., Qin, B., Wang, S., and Hu, G. (2020). Revisiting pre-trained models for Chinese natural language processing. arXiv preprint arXiv:2004.13922. Cui, Y., Che, W., Liu, T., Qin, B., Yang, Z., Wang, S., and Hu, G. (2019b). Pre-training with whole word masking for Chinese BERT. arXiv preprint arXiv:1906.08101. Cui, Y., Liu, T., Che, W., Xiao, L., Chen, Z., Ma, W., Wang, S., and Hu, G. (2018). A span-extraction dataset for Chinese machine reading comprehension. arXiv preprint arXiv:1810.07366. Cui, Y., Liu, T., Chen, Z., Wang, S., and Hu, G. (2016). Consensus attention-based neural networks for Chinese reading comprehension. arXiv preprint arXiv:1607.02250. Devlin, J., Chang, M.-W., Lee, K., and Toutanova, K. (2018). BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805. Gers, F. A., Schmidhuber, J., and Cummins, F. (1999). Learning to forget: continual prediction with LSTM. In Proceedings of the 1999 Ninth International Conference on Artificial Neural Networks, volume 2, pages 850–855. Gulrajani, I., Ahmed, F., Arjovsky, M., Dumoulin, V., and Courville, A. (2017). Improved training of Wasserstein GANs. In Advances in Neural Information Processing Systems, pages 5767–5777. Hermann, K. M., Kočiský, T., Grefenstette, E., Espeholt, L., Kay, W., Suleyman, M., and Blunsom, P. (2015). Teaching machines to read and comprehend. In Proceedings of the 28th International Conference on Neural Information Processing Systems, volume 1, pages 1693–1701. Huang, Z., Xu, W., and Yu, K. (2015). Bidirectional LSTM-CRF models for sequence tagging. arXiv preprint arXiv:1508.01991. Jin, Q., Dhingra, B., Liu, Z., Cohen, W. W., and Lu, X. (2019). PubMedQA: a dataset for biomedical research question answering. arXiv preprint arXiv:1909.06146. Joshi, M., Chen, D., Liu, Y., Weld, D. S., Zettlemoyer, L., and Levy, O. (2020). SpanBERT: improving pre-training by representing and predicting spans. Transactions of the Association for Computational Linguistics, 8:64–77. Lai, G., Xie, Q., Liu, H., Yang, Y., and Hovy, E. (2017). RACE: large-scale reading comprehension dataset from examinations. arXiv preprint arXiv:1704.04683. Lan, Z. Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., and Soricut, R. (2019). ALBERT: a lite BERT for self-supervised learning of language representations. arXiv preprint arXiv:1901.07291. Lee, C. H. and Lee, H. Y. (2019). Cross-lingual transfer learning for question answering. arXiv preprint arXiv:1907.06042. Lee, J., Yoon, W., Kim, S., Kim, D., Kim, S., So, C. H., and Kang, J. (2020). BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics, 36(4):1234–1240. Liu, S., Zhang, X., Zhang, S., Wang, H., and Zhang, W. (2019a). Neural machine reading comprehension: methods and trends. Applied Sciences, 9(18):3698. Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., and Stoyanov, V. (2019b). RoBERTa: a robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692. Mikolov, T., Le, Q. V., and Sutskever, I. (2013a). Exploiting similarities among languages for machine translation. arXiv preprint arXiv:1309.4168. Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., and Dean, J. (2013b). Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems, pages 3111–3119. Rajpurkar, P., Jia, R., and Liang, P. (2018). Know what you don’t know: unanswerable questions for SQuAD. arXiv preprint arXiv:1806.03822. Rajpurkar, P., Zhang, J., Lopyrev, K., and Liang, P. (2016). SQuAD: 100,000+ questions for machine comprehension of text. arXiv preprint arXiv:1606.05250. Shao, C. C., Liu, T., Lai, Y., Tseng, Y., and Tsai, S. (2018). DRCD: a Chinese machine reading comprehension dataset. arXiv preprint arXiv:1806.00920. Sun, C., Qiu, X., Xu, Y., and Huang, X. (2019). How to fine-tune BERT for text classification? In China National Conference on Chinese Computational Linguistics, pages 194–206. Springer. Suster, S. and Daelemans, W. (2018). CliCR: a dataset of clinical case reports for machine reading comprehension. arXiv preprint arXiv:1803.09720. Tsatsaronis, G., Balikas, G., Malakasiotis, P., Partalas, I., Zschunke, M., Alvers, M. R., Weissenborn, D., Krithara, A., Petridis, S., Polychronopoulos, D., Almirantis, Y., Pavlopoulos, J., Baskiotis, N., Gallinari, P., Artières, T., Ngomo, A. C. N., Heino, N., Gaussier, E., Barrio-Alvers, L., Schroeder, M., Androutsopoulos, I., and Paliouras, G. (2015). An overview of the BioASQ large-scale biomedical semantic indexing and question answering competition. BMC Bioinformatics, 16(1):138. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., and Polosukhin, I. (2017). Attention is all you need. In Advances in Neural Information Processing Systems, pages 5998–6008. Wang, H., Gan, Z., Liu, X., Liu, J., Gao, J., and Wang, H. (2019). Adversarial domain adaptation for machine reading comprehension. arXiv preprint arXiv:1908.09209. Wu, Y., Schuster, M., Chen, Z., Le, Q. V., Norouzi, M., Macherey, W., Krikun, M., Cao, Y., Gao, Q., Macherey, K., et al. (2016). Google’s neural machine translation system: bridging the gap between human and machine translation. arXiv preprint arXiv:1609.08144. Xu, L., Zhang, X., Li, L., Hu, H., Cao, C., Liu, W., Li, J., Li, Y., Sun, K., Xu, Y., et al. (2020). CLUE: a Chinese language understanding evaluation benchmark. arXiv preprint arXiv:2004.05986. Xu, Y., Liu, X., Shen, Y., Liu, J., and Gao, J. (2018). Multi-task learning with sample re-weighting for machine reading comprehension. arXiv preprint arXiv:1809.06963. Yang, W., Lu, W., and Zheng, V. (2017). A simple regularization-based algorithm for learning cross-domain word embeddings. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 2898–2904. Association for Computational Linguistics. Yang, Z., Dai, Z., Yang, Y., Carbonell, J., Salakhutdinov, R. R., and Le, Q. V. (2019). XLNet: generalized autoregressive pretraining for language understanding. In Advances in Neural Information Processing Systems, pages 5753–5763. Zhang, Z., Yang, J., and Zhao, H. (2020). Retrospective reader for machine reading comprehension. arXiv preprint arXiv:2001.09694. | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/49054 | - |
| dc.description.abstract | 隨著語言模型預訓練技術的巨大進展,機器閱讀理解(MRC)研究有了顯著的進步,近年來深受重視。由於相關訓練資源較為充足,目前MRC的研究大多集中在英語的一般性應用領域。然而在少量訓練資源領域的MRC研究則較缺乏,但對許多特定領域(例如醫療保健/生物醫學應用)這類的需求其實很高。此外,針對中文MRC的研究也正在起步。本研究的主要目的是針對少量資源機器閱讀理解發展出一種可行的解決方案。並提出一種基於預訓練/微調語言建模結構的轉換學習方法,其中包括四種不同的微調方法:聯合(Joint)、依序(Cascade)、對抗式學習(Adversarial learning)及多任務學習(Multitask learning)訓練。這些微調方法是運用資源較為豐富的其他領域訓練數據來提高機器學習的成效。同時,本研究也進行了多種實驗來評估在不同條件下的成效,包括輔助訓練任務支持程度的不同、目標數據的大小變化、及使用不同的中文MRC評測標準進行測試。本研究結果證實,從資源相對較為豐富的領域取得數據,可以協助提高資源較為稀少的特定領域其機器學習成效;同時採用多任務訓練方法進行精細化處理,則可以最大化輔助數據集對於少量資源機器閱讀理解的支持,並可達到最佳的機器閱讀理解效能。 | zh_TW |
| dc.description.abstract | With major advances in language model pre-training, machine reading comprehension (MRC) has improved significantly and attracted considerable research attention in recent years. Most MRC research has focused on general-domain English text, where training resources are plentiful. MRC in low-resource domains is far less developed, yet it is in high demand for specific domains such as healthcare and biomedical applications; research on Chinese MRC is likewise limited. The purpose of this research is to develop a feasible solution for low-resource machine reading comprehension. We employ a transfer learning approach based on a pre-training/fine-tuning language modeling structure and develop four domain adaptation methods: joint, cascade, adversarial, and multitask training. These methods improve the effectiveness of MRC by drawing support from training data in other, higher-resource domains. Various experiments evaluate the domain adaptation methods under different conditions, including different scales of support from the auxiliary tasks, limited target data sizes, and tests on different Chinese MRC benchmarks. According to our evaluation results, MRC effectiveness in specific domains with low training resources can be improved through the support of data from higher-resource domains, and the multitask fine-tuning method generally maximizes the benefit of the auxiliary datasets for low-resource MRC tasks and achieves the best MRC effectiveness. | en |
| dc.description.provenance | Made available in DSpace on 2021-06-15T11:14:45Z (GMT). No. of bitstreams: 1 U0001-1208202023281800.pdf: 2625768 bytes, checksum: f28f5f607a3b0760cfad9137cb46118e (MD5) Previous issue date: 2020 | en |
| dc.description.tableofcontents | Acknowledgements ii Abstract (Chinese) iii Abstract iv List of Figures viii List of Tables x Chapter 1 Introduction 1 1.1 Background 1 1.2 Research Motivation 3 1.3 Research Objectives 4 Chapter 2 Literature Review 6 2.1 Word Representation 6 2.1.1 Cross-Domain Word Representation 7 2.2 Pre-trained Language Model 8 2.2.1 Enhanced Pre-trained Language Model 8 2.2.2 Domain Specific Pre-trained Language Model 9 2.3 Machine Reading Comprehension 10 2.4 Machine Reading Comprehension on Specific Domains 12 2.4.1 Biomedical MRC 12 2.4.2 MRC in Chinese 12 2.4.3 Domain Adaptation on MRC 13 Chapter 3 Methodology 15 3.1 Problem Description 15 3.2 NTUBI MRC Task and Dataset 16 3.3 Pre-trained Language Model 17 3.3.1 Pre-trained Language Model Selection 17 3.4 Domain Adaptation Methods 19 3.4.1 Joint Training 19 3.4.2 Cascade Training 20 3.4.3 Adversarial Training 21 3.4.4 Multitask Training 23 Chapter 4 Experiments 25 4.1 Datasets and Metrics 25 4.2 Experiment Details 27 4.2.1 Model Structure 27 4.2.2 Hyper Parameters 28 4.2.3 Benchmarks 29 4.3 Results on Full-sized Datasets 30 4.3.1 Adversarial Training and Multitask Training 31 4.4 Effects of Different Support Scale from the Auxiliary Task 32 4.5 Performance in Limited Target Size 34 4.5.1 Half of the Target Samples from NTUBI 35 4.5.2 One Tenth of the Target Samples from NTUBI 36 4.6 Evaluations on Other Tasks 37 4.6.1 Evaluations on Other Tasks with the Same Size of NTUBI 38 4.7 Discussions 41 4.7.1 Data Amount Requirements 41 4.7.2 Effect from Domain Differences 41 4.7.3 Recommended Occasion of Adopting Proposed Transfer Learning Method 42 Chapter 5 Concluding Remarks and Further Work 43 References 45 | |
| dc.language.iso | en | |
| dc.subject | 領域適配 | zh_TW |
| dc.subject | 機器閱讀理解 | zh_TW |
| dc.subject | 轉換式學習 | zh_TW |
| dc.subject | domain adaptation | en |
| dc.subject | transfer learning | en |
| dc.subject | machine reading comprehension | en |
| dc.title | 使用轉換學習方法進行少量資源中文機器閱讀理解 | zh_TW |
| dc.title | Low-Resource Chinese Machine Reading Comprehension Using A Transfer Learning Approach | en |
| dc.type | Thesis | |
| dc.date.schoolyear | 108-2 | |
| dc.description.degree | 碩士 | |
| dc.contributor.oralexamcommittee | 吳家齊(Chia-Chi Wu),張詠淳(Yung-Chun Chang) | |
| dc.subject.keyword | 機器閱讀理解,轉換式學習,領域適配, | zh_TW |
| dc.subject.keyword | machine reading comprehension,transfer learning,domain adaptation, | en |
| dc.relation.page | 50 | |
| dc.identifier.doi | 10.6342/NTU202003171 | |
| dc.rights.note | 有償授權 | |
| dc.date.accepted | 2020-08-14 | |
| dc.contributor.author-college | 管理學院 | zh_TW |
| dc.contributor.author-dept | 資訊管理學研究所 | zh_TW |
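
The English abstract above describes fine-tuning a pre-trained language model with four domain adaptation methods, of which multitask training performed best. As a purely illustrative aid, the following is a minimal PyTorch sketch of that multitask fine-tuning idea: a shared encoder updated on interleaved batches from a high-resource auxiliary MRC dataset and a low-resource target dataset, each with its own span-extraction head. Every name here (`SpanMRCModel`, the toy Transformer encoder, the random stand-in data, the round-robin schedule) is a hypothetical simplification, not the thesis implementation, which builds on pre-trained Chinese BERT-style models.

```python
# Minimal sketch of multitask fine-tuning for low-resource span-extraction MRC.
# Assumptions: a shared encoder, one span head per task, and synthetic data in
# place of real (question + passage, answer span) examples.
import torch
import torch.nn as nn

VOCAB, HIDDEN, SEQ_LEN = 1000, 128, 64

class SpanMRCModel(nn.Module):
    """Shared encoder with one span-prediction head per task (auxiliary / target)."""
    def __init__(self, num_tasks: int = 2):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, HIDDEN)
        layer = nn.TransformerEncoderLayer(d_model=HIDDEN, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        # Each head predicts start/end logits over token positions.
        self.span_heads = nn.ModuleList([nn.Linear(HIDDEN, 2) for _ in range(num_tasks)])

    def forward(self, input_ids, task_id: int):
        hidden = self.encoder(self.embed(input_ids))       # (B, L, H)
        logits = self.span_heads[task_id](hidden)          # (B, L, 2)
        start_logits, end_logits = logits.unbind(dim=-1)   # (B, L) each
        return start_logits, end_logits

def span_loss(start_logits, end_logits, start_pos, end_pos):
    ce = nn.CrossEntropyLoss()
    return (ce(start_logits, start_pos) + ce(end_logits, end_pos)) / 2

def random_batch(batch_size: int):
    """Stand-in for real tokenized question+passage inputs and answer spans."""
    input_ids = torch.randint(0, VOCAB, (batch_size, SEQ_LEN))
    start = torch.randint(0, SEQ_LEN, (batch_size,))
    end = torch.clamp(start + torch.randint(0, 5, (batch_size,)), max=SEQ_LEN - 1)
    return input_ids, start, end

model = SpanMRCModel()
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)

# Multitask loop: alternate auxiliary (task 0, high-resource) and target
# (task 1, low-resource) batches so the shared encoder benefits both tasks.
for step in range(20):
    task_id = 0 if step % 2 == 0 else 1          # simple round-robin schedule
    input_ids, start, end = random_batch(8 if task_id == 0 else 2)
    start_logits, end_logits = model(input_ids, task_id)
    loss = span_loss(start_logits, end_logits, start, end)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    if step % 5 == 0:
        print(f"step {step} task {task_id} loss {loss.item():.3f}")
```

In a realistic setting, the auxiliary batches would come from a larger Chinese MRC dataset (for example, one of the benchmarks cited above, such as DRCD or CMRC), the target batches from the low-resource domain, and the sampling ratio and loss weighting between tasks would be additional hyperparameters to tune.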
Appears in Collections: 資訊管理學系 (Department of Information Management)
Files in This Item:
| File | Size | Format |
|---|---|---|
| U0001-1208202023281800.pdf (Restricted Access) | 2.56 MB | Adobe PDF |
All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.
