Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/85632

Full metadata record
DC field: value [language]
dc.contributor.advisor: 廖世偉 (Shih-Wei Liao)
dc.contributor.author: Chih-Chieh Wang [en]
dc.contributor.author: 王致傑 [zh_TW]
dc.date.accessioned: 2023-03-19T23:20:09Z
dc.date.copyright: 2022-07-12
dc.date.issued: 2022
dc.date.submitted: 2022-06-28
dc.identifier.citation:
1. Biesialska, M., K. Biesialska, and M.R. Costa-jussà, Continual lifelong learning in natural language processing: A survey. arXiv preprint arXiv:2012.09823, 2020.
2. Devlin, J., et al., BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, 2018.
3. Liu, Y., et al., RoBERTa: A robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692, 2019.
4. McCloskey, M. and N.J. Cohen, Catastrophic interference in connectionist networks: The sequential learning problem, in Psychology of Learning and Motivation. 1989, Elsevier. p. 109-165.
5. Rebuffi, S.-A., et al., iCaRL: Incremental classifier and representation learning, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017.
6. Castro, F.M., et al., End-to-end incremental learning, in Proceedings of the European Conference on Computer Vision (ECCV). 2018.
7. Arumae, K. and P. Bhatia, CALM: Continuous Adaptive Learning for Language Modeling. arXiv preprint arXiv:2004.03794, 2020.
8. Xu, H., et al., Pre-trained models: Past, present and future. arXiv preprint arXiv:2106.07139, 2021.
9. Ruder, S., An overview of multi-task learning in deep neural networks. arXiv preprint arXiv:1706.05098, 2017.
10. Strubell, E., A. Ganesh, and A. McCallum, Energy and policy considerations for deep learning in NLP. arXiv preprint arXiv:1906.02243, 2019.
11. Ke, Z., H. Xu, and B. Liu, Adapting BERT for continual learning of a sequence of aspect sentiment classification tasks. arXiv preprint arXiv:2112.03271, 2021.
12. Houlsby, N., et al., Parameter-efficient transfer learning for NLP, in International Conference on Machine Learning. 2019. PMLR.
13. Pfeiffer, J., et al., AdapterHub: A framework for adapting transformers. arXiv preprint arXiv:2007.07779, 2020.
14. Raffel, C., et al., Exploring the limits of transfer learning with a unified text-to-text transformer. arXiv preprint arXiv:1910.10683, 2019.
15. Pfeiffer, J., et al., AdapterFusion: Non-destructive task composition for transfer learning. arXiv preprint arXiv:2005.00247, 2020.
16. Vaswani, A., et al., Attention is all you need. Advances in Neural Information Processing Systems, 2017. 30.
17. McCann, B., et al., The natural language decathlon: Multitask learning as question answering. arXiv preprint arXiv:1806.08730, 2018.
18. Bhargava, P., A. Drozd, and A. Rogers, Generalization in NLI: Ways (Not) To Go Beyond Simple Heuristics. arXiv preprint arXiv:2110.01518, 2021.
19. Jiao, X., et al., TinyBERT: Distilling BERT for natural language understanding. arXiv preprint arXiv:1909.10351, 2019.
20. Radford, A., et al., Language models are unsupervised multitask learners. OpenAI blog, 2019. 1(8): p. 9.
21. Kirkpatrick, J., et al., Overcoming catastrophic forgetting in neural networks. Proceedings of the National Academy of Sciences, 2017. 114(13): p. 3521-3526.
22. Bapna, A., N. Arivazhagan, and O. Firat, Simple, scalable adaptation for neural machine translation. arXiv preprint arXiv:1909.08478, 2019.
23. Artetxe, M., S. Ruder, and D. Yogatama, On the cross-lingual transferability of monolingual representations. arXiv preprint arXiv:1910.11856, 2019.
24. Pfeiffer, J., et al., MAD-X: An adapter-based framework for multi-task cross-lingual transfer. arXiv preprint arXiv:2005.00052, 2020.
25. Kim, H., et al., An Alternating Training Method of Attention-Based Adapters for Visual Explanation of Multi-Domain Satellite Images. IEEE Access, 2021. 9: p. 62332-62346.
26. Shazeer, N. and M. Stern, Adafactor: Adaptive learning rates with sublinear memory cost, in International Conference on Machine Learning. 2018. PMLR.
27. Loshchilov, I. and F. Hutter, Decoupled weight decay regularization. arXiv preprint arXiv:1711.05101, 2017.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/85632
dc.description.abstract: 在真實世界的應用中,新的資料或任務會逐一進入系統,神經網路模型一旦直接學習這些新的任務便會導致災難性遺忘(Catastrophic Forgetting)的發生,這迫使模型在學習新資料的同時必須連同舊資料一起重新訓練,否則就必須訓練一個新的模型去對應新的任務,兩種方法皆造成時間或儲存空間上的資源浪費。持續學習(Continual Learning)旨在研究如何讓深度神經網路在學習新的知識的同時不會遺忘過去所學習的知識。然而,過去的研究會因為模型限制導致其無法應對各種類型的任務,或無法提供可跟一般的深度學習模型比擬的效能。本文提出一個基於 T5(Unified Text-to-Text Transformer)和適配器(Adapter)技術的持續學習框架:T5-CL。相較於預訓練模型,該框架在持續學習上對於時間和空間的利用效率更好,能在不做任何改動的情況下應對各種類型任務,且提供穩定具競爭力的效能。此外,該框架提供了一個統一的端對端介面,使用者僅需處理資料前處理並提供該資料集的評分公式即可訓練,幫助使用者節省對於新任務的開發時間。 [zh_TW]
dc.description.abstract: In real-world applications, new data or tasks enter the system sequentially. If a deep neural network learns these new tasks directly, it suffers from catastrophic forgetting, forcing engineers either to re-train the model on both the old and the new data or to train a separate model for each new task; both approaches waste time or storage. Continual learning studies how to let a deep neural network acquire new knowledge without forgetting what it learned in the past. However, due to architectural limitations, past approaches either cannot handle various types of tasks or cannot provide performance comparable to general neural networks. Therefore, we propose T5-CL, a continual learning framework based on T5 (the Unified Text-to-Text Transformer) and adapters. T5-CL copes with various types of tasks without modifying the model architecture and provides stable, competitive performance with better time and space efficiency. Moreover, the framework offers a unified end-to-end interface: users only need to handle data preprocessing and supply an evaluation metric for their dataset, which saves development time on new tasks. We further demonstrate the utility of this framework through extensive experiments. [en]
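The abstract describes two ideas that T5-CL builds on: training one small adapter per incoming task on top of a frozen pre-trained backbone, and casting every task as text-to-text so a single interface covers different task types. The short Python sketch below illustrates only those two ideas under stated assumptions; it is not the thesis implementation, and the class, function, and task names (BottleneckAdapter, to_text_to_text, "sst2", "mnli") are hypothetical.

import torch
import torch.nn as nn
from typing import Tuple

class BottleneckAdapter(nn.Module):
    """Houlsby-style adapter: down-project, non-linearity, up-project, residual."""
    def __init__(self, hidden_size: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)
        self.act = nn.ReLU()

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        return hidden_states + self.up(self.act(self.down(hidden_states)))

def to_text_to_text(task: str, text: str, label: str) -> Tuple[str, str]:
    """Cast a classification example as (source, target) strings for a T5-style model."""
    return f"{task}: {text}", label

hidden_size = 512
# Stand-in for one frozen layer of a pre-trained encoder; its weights are never updated,
# so knowledge learned for earlier tasks cannot be overwritten (no catastrophic forgetting).
backbone = nn.TransformerEncoderLayer(d_model=hidden_size, nhead=8, batch_first=True)
for p in backbone.parameters():
    p.requires_grad = False

adapters = {}                              # one small adapter kept per task
for task in ["sst2", "mnli"]:              # tasks arrive sequentially
    adapters[task] = BottleneckAdapter(hidden_size)
    optim = torch.optim.AdamW(adapters[task].parameters(), lr=1e-3)
    x = torch.randn(2, 16, hidden_size)    # stand-in for a batch of token embeddings
    out = adapters[task](backbone(x))      # adapter applied after the frozen layer
    loss = out.pow(2).mean()               # placeholder objective for illustration
    loss.backward()
    optim.step()
    optim.zero_grad()

src, tgt = to_text_to_text("sst2", "a gripping, well-acted thriller", "positive")
print(src, "->", tgt)

Because only the small per-task adapter is trained and stored for each new task, time and storage grow with the adapter size rather than with the full model, which is the efficiency argument made in the abstract.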
dc.description.provenance: Made available in DSpace on 2023-03-19T23:20:09Z (GMT). No. of bitstreams: 1. U0001-2606202213290400.pdf: 3474640 bytes, checksum: 74f995fbe950f353fa5a850ee94bdd6e (MD5). Previous issue date: 2022. [en]
dc.description.tableofcontents:
口試委員會審定書 I
誌謝 II
摘要 III
Abstract IV
目錄 V
圖目錄 VII
Chapter 1 Introduction 2
Chapter 2 Preliminary 5
2.1 Transformer 5
2.2 BERT 6
2.3 T5 10
Chapter 3 Related Work 12
3.1 Desired properties for a CL system 12
3.2 Related machine learning techniques 13
3.3 Related works to CL 14
3.3.1 Traditional methods 14
3.3.2 Adapter-BERT 15
3.4 Summary 20
Chapter 4 Method 21
4.1 T5-CL 21
4.2 Adapter fusion layer 24
Chapter 5 Experiments 27
5.1 Datasets 27
5.2 Baseline Methods 30
5.3 Experiment Setup 30
5.3.1 Training 30
5.3.2 Hyperparameters 31
5.4 Results and Analysis 31
Chapter 6 Conclusion 35
REFERENCE 36
Appendix 1 38
dc.language.iso: zh-TW
dc.subject: 預訓練模型 [zh_TW]
dc.subject: 持續學習 [zh_TW]
dc.subject: 災難性遺忘 [zh_TW]
dc.subject: 適配器 [zh_TW]
dc.subject: 端對端框架 [zh_TW]
dc.subject: End-to-End Framework [en]
dc.subject: Continual Learning [en]
dc.subject: Catastrophic Forgetting [en]
dc.subject: Pre-trained Model [en]
dc.subject: Adapter [en]
dc.title: T5-CL: 一個應用於不同類型分類任務的端對端持續學習框架 [zh_TW]
dc.title: T5-CL: An End-to-End Continual Learning Framework Applying on Different Types of Classification Tasks [en]
dc.type: Thesis
dc.date.schoolyear: 110-2
dc.description.degree: 碩士
dc.contributor.oralexamcommittee: 戴敏育 (Min-Yuh Day), 孫瑞鴻 (Ray-Hon Sun)
dc.subject.keyword: 持續學習, 災難性遺忘, 預訓練模型, 適配器, 端對端框架 [zh_TW]
dc.subject.keyword: Continual Learning, Catastrophic Forgetting, Pre-trained Model, Adapter, End-to-End Framework [en]
dc.relation.page: 40
dc.identifier.doi: 10.6342/NTU202201121
dc.rights.note: 同意授權(全球公開)
dc.date.accepted: 2022-06-30
dc.contributor.author-college: 電機資訊學院 [zh_TW]
dc.contributor.author-dept: 資訊工程學研究所 [zh_TW]
dc.date.embargo-lift: 2022-07-12
Appears in collections: 資訊工程學系

Files in this item:
File: U0001-2606202213290400.pdf, Size: 3.39 MB, Format: Adobe PDF


All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.
