Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/85632
| Title: | T5-CL: An End-to-End Continual Learning Framework Applying on Different Types of Classification Tasks (T5-CL: 一個應用於不同類型分類任務的端對端持續學習框架) |
| Author: | Chih-Chieh Wang (王致傑) |
| Advisor: | Shih-Wei Liao (廖世偉) |
| Keywords: | Continual Learning, Catastrophic Forgetting, Pre-trained Model, Adapter, End-to-End Framework |
| Publication year: | 2022 |
| Degree: | Master's |
| Abstract: | In real-world applications, new data or tasks enter a system sequentially. A deep neural network that learns these tasks directly suffers from catastrophic forgetting, which forces engineers either to retrain the model on old and new data together or to train a separate model for each new task. Both options waste time or storage. Continual learning studies how a deep neural network can acquire new knowledge without forgetting what it learned before. However, because of architectural limitations, previous approaches either cannot handle diverse task types or cannot match the performance of ordinary deep learning models. We therefore propose T5-CL, a continual learning framework based on T5 (Unified Text-to-Text Transformer) and adapters. Compared with pre-trained models, the framework uses time and space more efficiently in continual learning, handles various types of tasks without any change to the model architecture, and delivers stable, competitive performance. It also provides a unified end-to-end interface: users only need to preprocess their data and supply the evaluation metric for the dataset, which shortens development time for new tasks. We further demonstrate the utility of the framework through extensive experiments. |
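| URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/85632 |
| DOI: | 10.6342/NTU202201121 |
| Full-text license: | Authorized (publicly available worldwide) |
| Full-text release date: | 2022-07-12 |
| Appears in collections: | Department of Computer Science and Information Engineering |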
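The abstract describes an interface in which the user supplies only data preprocessing and an evaluation metric, while each task gets its own adapter on top of a shared model. The following is a toy sketch of that interface shape, not the thesis implementation: every name (`Task`, `ContinualLearner`, `learn`, `evaluate`) is hypothetical, and adapter training is replaced by a lookup table so the example runs without any deep learning library.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# All names below are hypothetical illustrations, not the T5-CL API.
Example = Dict[str, str]
Preprocess = Callable[[Example], Tuple[str, str]]  # raw example -> (input text, target text)
Metric = Callable[[List[str], List[str]], float]   # (predictions, targets) -> score


@dataclass
class Task:
    name: str
    preprocess: Preprocess  # user-supplied: casts the task into text-to-text form
    metric: Metric          # user-supplied: scores predictions for this dataset


@dataclass
class ContinualLearner:
    # One entry per task. Each entry stands in for task-specific adapter weights;
    # the shared backbone (not modelled here) would stay frozen.
    adapters: Dict[str, Dict[str, str]] = field(default_factory=dict)

    def learn(self, task: Task, examples: List[Example]) -> None:
        # Placeholder for adapter training: memorise input -> target pairs.
        # Only this task's entry is written, so earlier tasks are untouched.
        self.adapters[task.name] = dict(task.preprocess(ex) for ex in examples)

    def evaluate(self, task: Task, examples: List[Example]) -> float:
        pairs = [task.preprocess(ex) for ex in examples]
        adapter = self.adapters[task.name]
        predictions = [adapter.get(source, "") for source, _ in pairs]
        targets = [target for _, target in pairs]
        return task.metric(predictions, targets)


def accuracy(predictions: List[str], targets: List[str]) -> float:
    return sum(p == t for p, t in zip(predictions, targets)) / len(targets)


sentiment = Task(
    name="sentiment",
    preprocess=lambda ex: ("sentiment: " + ex["text"], ex["label"]),
    metric=accuracy,
)
nli = Task(
    name="nli",
    preprocess=lambda ex: (
        f"nli premise: {ex['premise']} hypothesis: {ex['hypothesis']}",
        ex["label"],
    ),
    metric=accuracy,
)

sentiment_data = [
    {"text": "great movie", "label": "positive"},
    {"text": "dull plot", "label": "negative"},
]
nli_data = [
    {"premise": "A dog runs.", "hypothesis": "An animal moves.", "label": "entailment"},
]

learner = ContinualLearner()
learner.learn(sentiment, sentiment_data)
score_before = learner.evaluate(sentiment, sentiment_data)

# Learning a second, differently shaped task does not alter the first task's entry.
learner.learn(nli, nli_data)
score_after = learner.evaluate(sentiment, sentiment_data)
```

In this sketch the two tasks differ only in their `preprocess` and `metric` callables, which mirrors the division of labour the abstract describes. Keeping per-task state separate is what prevents the second task from overwriting the first; a real system would store adapter weights there instead of a dictionary.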
Files in this item:
| File | Size | Format | |
|---|---|---|---|
| U0001-2606202213290400.pdf | 3.39 MB | Adobe PDF | View/Open |
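Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.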
