Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/63049

| Title: | 使用多語言語言表示模型進行跨語言遷移學習之問答系統 Cross-lingual Transfer Learning with Multi-lingual Language Representation Model on Question Answering |
| Author: | Tsung-Yuan Hsu (許宗嫄) |
| Advisor: | Hung-Yi Lee (李宏毅) |
| Keywords: | Natural Language Processing, Natural Language Understanding, Question Answering, NLP, NLU, QA |
| Year of Publication: | 2020 |
| Degree: | Master's |
| Abstract: | Deep-learning-based question answering (QA) systems are powerful but highly data-dependent. Broader application of an existing system is often hindered by mismatches in data domain, such as differences in language, form, or topic, yet collecting and annotating sufficient target-domain data is impractical and costly. Many studies therefore address transfer learning, which aims to improve models with limited target-domain resources by exploiting annotated data from other domains. This thesis focuses on cross-lingual transfer learning and explores how a single large, well-curated monolingual question answering dataset can be used to improve QA performance in other languages.
Most existing methods for cross-lingual transfer learning on question answering rely on an additional machine translation system to bridge the gap between languages, which makes translation quality decisive for QA performance. A well-trained machine translation system is thus a prerequisite, yet training one itself requires large amounts of annotated data, a requirement that is hard to satisfy for low-resource languages. This thesis therefore takes a different approach: by building on a multilingual language representation model pretrained on non-parallel multilingual data in a self-supervised way, the QA model is equipped with the ability to understand different languages. Trained on English question answering data alone, it acquires question answering skills that carry over to other languages, yielding a cross-lingual model that can answer questions in Chinese.
The thesis further evaluates this method on several artificial datasets with designed linguistic features, in order to identify the mechanisms behind the model's cross-lingual ability. The experiments show that the pretrained multilingual language representation model maps words with similar meanings in different languages to similar vectors, a phenomenon the thesis calls "language-agnostic" representation, or cross-lingual alignment. This phenomenon may be caused by the small amount of code-switching data present in the pretraining corpus of the language representation model. The thesis also examines how a quantified measure of cross-lingual alignment correlates with cross-lingual transfer performance on downstream tasks. (A brief illustrative code sketch of this zero-shot setting follows the record below.) |
| URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/63049 |
| DOI: | 10.6342/NTU202000827 |
| Full-text Access Rights: | Paid authorization |
| Appears in Collections: | Graduate Institute of Communication Engineering |
Files in This Item:
| File | Size | Format |
|---|---|---|
| ntu-109-1.pdf (restricted; not authorized for public access) | 5.07 MB | Adobe PDF |
All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.
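
The abstract above describes a zero-shot setting: a QA model built on a multilingual representation model is fine-tuned on English data only and then applied directly to Chinese questions. The following is a minimal sketch of that setting, not the thesis's released implementation; it assumes the HuggingFace Transformers library and uses `bert-base-multilingual-cased` as a stand-in for a checkpoint that has already been fine-tuned on English SQuAD.

```python
# Minimal sketch (assumed setup, not the thesis's code): apply a
# multilingual QA model fine-tuned on English data to a Chinese question.
import torch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

# Hypothetical checkpoint: multilingual BERT fine-tuned on English SQuAD only.
MODEL = "bert-base-multilingual-cased"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForQuestionAnswering.from_pretrained(MODEL)

question = "臺灣大學位於哪個城市？"  # "Which city is NTU located in?"
context = "國立臺灣大學是一所位於台北市的研究型大學。"

inputs = tokenizer(question, context, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Pick the most likely answer span from the start/end logits.
start = outputs.start_logits.argmax()
end = outputs.end_logits.argmax()
answer_ids = inputs["input_ids"][0, start:end + 1]
print(tokenizer.decode(answer_ids))
```

The abstract also attributes the transfer to "cross-lingual alignment" of the pretrained representations. A rough way to probe this, under the assumption that mean-pooled last-layer hidden states are a reasonable sentence representation, is to compare the vectors the model assigns to translation-equivalent sentences:

```python
# Rough probe of cross-lingual alignment: cosine similarity between
# mean-pooled representations of an English sentence and its translation.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModel.from_pretrained("bert-base-multilingual-cased")

def sentence_vector(sentence: str) -> torch.Tensor:
    """Mean-pool the last-layer hidden states of a sentence."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]
    return hidden.mean(dim=0)

en = sentence_vector("The answer is in the second paragraph.")
zh = sentence_vector("答案在第二段。")
print(torch.cosine_similarity(en, zh, dim=0).item())
```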
