Efficient Unseen Language Adaptation for Multilingual Pre-Trained Language Models (預訓練模型增加語言之高效率適應方法)

陳柏衡; Po-Heng Chen

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/92733

Title:	Efficient Unseen Language Adaptation for Multilingual Pre-Trained Language Models (預訓練模型增加語言之高效率適應方法)
Author:	Po-Heng Chen (陳柏衡)
Advisor:	Yun-Nung Chen (陳縕儂)
Keywords:	Natural Language Understanding, Cross-Lingual Transfer, Soft Prompt, Parameter-Efficient Fine-Tuning, Language Model
Year of publication:	2024
Degree:	Master's
Abstract:	Multilingual pre-trained language models (mPLMs) have demonstrated notable effectiveness in zero-shot cross-lingual transfer: they can be fine-tuned solely on a task in the source language and then applied to the same task in a target language. However, for low-resource languages unseen during pre-training, relying on zero-shot cross-lingual transfer alone often yields poor results. One common strategy is to continue training the mPLM on the target language with a masked language modeling objective. This approach is inefficient, because all model parameters must be adjusted for language adaptation. In this thesis, we propose a more efficient solution: soft-prompt tuning for language adaptation. Our experiments show that, with carefully designed soft prompts, soft-prompt tuning enables mPLMs to achieve effective zero-shot cross-lingual transfer to downstream tasks in previously unseen languages. Notably, prompt tuning outperforms the baselines adapted through continued training on two text classification benchmarks covering 18 low-resource languages, while tuning only 0.28% as many parameters. These results indicate that soft-prompt tuning adapts mPLMs to previously unseen languages both more effectively and more efficiently than traditional fine-tuning.
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/92733
DOI:	10.6342/NTU202401193
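Full-text authorization:	Authorized (access limited to campus)
Appears in collections:	Department of Computer Science and Information Engineering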
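
The soft-prompt idea summarized in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: this record does not describe the prompt design, so the function names, the prompt length, the hidden size, and the backbone parameter count below are all hypothetical. The sketch only shows the general mechanism, in which a small matrix of trainable vectors is prepended to the frozen model's token embeddings, and how the ratio of tuned parameters relative to full fine-tuning would be computed.

```python
import random


def make_soft_prompt(prompt_length, hidden_size, seed=0):
    """Create the only trainable parameters: prompt_length vectors of size hidden_size.

    Hypothetical initialization (small Gaussian noise); the thesis may use another scheme.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.02) for _ in range(hidden_size)]
            for _ in range(prompt_length)]


def prepend_soft_prompt(soft_prompt, token_embeddings):
    """Return the sequence fed to the frozen model: soft prompt first, then the tokens."""
    return [list(v) for v in soft_prompt] + [list(v) for v in token_embeddings]


def tuned_parameter_ratio(prompt_length, hidden_size, full_finetune_params):
    """Tuned parameters of the soft prompt relative to tuning every model parameter."""
    return (prompt_length * hidden_size) / full_finetune_params


if __name__ == "__main__":
    # Toy sizes chosen for illustration only; they are not taken from the thesis.
    prompt = make_soft_prompt(prompt_length=4, hidden_size=3)
    tokens = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
    sequence = prepend_soft_prompt(prompt, tokens)
    print(len(sequence))
    print(tuned_parameter_ratio(4, 3, full_finetune_params=1200))
```

With the toy numbers above, the prompt holds 12 parameters against a hypothetical 1,200-parameter backbone. The 0.28% figure reported in the abstract depends on the actual model and prompt configuration, which this record does not state.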

Files in this document:

File	Size	Format
ntu-112-2.pdf (access restricted to NTU campus IP addresses; off campus, use the VPN remote-access service)	2.03 MB	Adobe PDF

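Unless a document's copyright terms are specifically indicated otherwise, all documents in this system are protected by copyright, with all rights reserved.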
