Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/91187
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 楊佳玲 | zh_TW |
dc.contributor.advisor | Chia-Lin Yang | en |
dc.contributor.author | 洪靖嘉 | zh_TW |
dc.contributor.author | Jing-Jia Hung | en |
dc.date.accessioned | 2023-11-28T16:10:06Z | - |
dc.date.available | 2024-03-23 | - |
dc.date.copyright | 2023-11-28 | - |
dc.date.issued | 2023 | - |
dc.date.submitted | 2023-11-20 | - |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/91187 | - |
dc.description.abstract | 增量學習是一種訓練方式,會根據在離散時間點不斷到來的新資料類別,逐步增加深度學習模型的知識。在增量學習的訓練方式下,過去類別的資料完全不保留或僅保留一小部分,因此必須防止遺失從過去資料類別中學到的知識,也就是所謂的災難性遺忘。此外,邊緣設備的普及使得人工智慧應用被廣泛整合到這些平台上。由於邊緣設備的計算資源有限,深度學習模型通常會先經過壓縮。將增量學習與模型壓縮結合的兩種直接方法是:(1) 在每次增量學習迭代後壓縮原始模型,或 (2) 在初始步驟中壓縮原始模型,然後在隨後的增量學習迭代中直接使用壓縮後的模型。
然而,我們觀察到,方法 (1) 即使使用只需少量樣本即可微調壓縮模型的最先進壓縮方法,仍有抹除先前所學知識的風險;另一方面,方法 (2) 會降低準確率,因為固定的模型結構不足以適應新類別,不符合增量學習的動態特性。對此,我們提出了首個以類別為導向的模型結構選擇方法 (CASS)。所提出的方法先為新舊類別找出合適的模型結構,再透過「聯合」(Union) 方法與細粒度剪枝來保留過去的知識。實驗結果顯示,在所有測試案例中,所提出的方法相較於方法 (1) 與方法 (2) 分別平均提升 5.6% 與 1.5% 的準確率,且與方法 (1) 相比,訓練時間減少了 40%。 | zh_TW |
dc.description.abstract | Incremental learning (IL) is a training paradigm that gradually increases the "knowledge" of a deep learning (DL) model as new classes of data keep arriving at discrete times. Under the IL paradigm, none or only a small amount of data from past classes can be kept, so it is crucial to prevent losing the knowledge learned from past classes, a problem known as catastrophic forgetting. Moreover, the growing prevalence of edge devices has led to the widespread integration of AI-based applications on these platforms. Because edge devices have limited computing resources, DL models are commonly compressed. Two straightforward ways to combine the IL training paradigm with model compression are (a) compressing the original model after each IL iteration, or (b) compressing the original model once at the initial step and using the compressed model for all subsequent IL iterations.
However, we observe that method (a), even when it uses a state-of-the-art compression method that needs only a small number of samples to fine-tune the compressed model, risks erasing previously acquired knowledge. On the other hand, method (b) results in reduced accuracy because its fixed model structure cannot adapt to new classes, which does not suit the dynamic nature of IL. In response, we propose the first class-aware model structure selection method (CASS). CASS first determines a model structure suitable for both new and old classes, then uses a "Union" step and fine-grained pruning to preserve past knowledge. Experimental results show that, across all our test cases, CASS improves accuracy by 5.6% and 1.5% on average over method (a) and method (b), respectively, while reducing training time by 40% compared to method (a). | en |
dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2023-11-28T16:10:06Z No. of bitstreams: 0 | en |
dc.description.provenance | Made available in DSpace on 2023-11-28T16:10:06Z (GMT). No. of bitstreams: 0 | en |
dc.description.tableofcontents | Acknowledgements i
摘要 ii
Abstract iii
Contents v
List of Figures vii
List of Tables viii
Chapter 1 Introduction 1
Chapter 2 Background & Motivation 7
2.1 Basics of Incremental Learning and Model Pruning 7
2.1.1 Incremental Learning 7
2.1.2 Model Pruning 8
2.2 Motivation 8
Chapter 3 CASS: Class-Aware Model Structure Selection 12
3.1 CASS overview 12
3.2 Class-aware pruning 14
3.3 Union 15
3.4 Model Refinement 16
3.5 Implementation Scenarios 18
Chapter 4 Experiments 20
4.1 Experiment Setup 20
4.2 Main Results 22
4.2.1 Results on TinyImagenet 22
4.2.2 Results on CIFAR-100 25
4.3 Ablation Study 28
4.4 Further Analyses 29
4.4.1 Compatibility Analysis 29
4.4.2 Different number of kept samples for old classes 31
4.4.3 Detailed IL environment Analysis 32
Chapter 5 Related Works 35
5.1 Incremental learning 35
5.2 Model Pruning 36
Chapter 6 Conclusion and Future Work 39
References 40 | - |
dc.language.iso | en | - |
dc.title | 邊緣設備上的高效增量學習模型結構選擇 | zh_TW |
dc.title | Class-Aware Model Structure Selection for Efficient Incremental Learning on Edge Devices | en |
dc.type | Thesis | - |
dc.date.schoolyear | 112-1 | - |
dc.description.degree | 碩士 | - |
dc.contributor.oralexamcommittee | 陳依蓉;鄭湘筠;陳駿丞 | zh_TW |
dc.contributor.oralexamcommittee | Yi-Jung Chen;Hsiang-Yun Cheng;Jun-Cheng Chen | en |
dc.subject.keyword | 剪枝,增量學習,邊緣計算 | zh_TW |
dc.subject.keyword | Pruning, Incremental learning, Edge Computing | en |
dc.relation.page | 47 | - |
dc.identifier.doi | 10.6342/NTU202301431 | - |
dc.rights.note | 未授權 | - |
dc.date.accepted | 2023-11-21 | - |
dc.contributor.author-college | 電機資訊學院 | - |
dc.contributor.author-dept | 資訊網路與多媒體研究所 | - |
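The abstract above describes CASS only at a high level: select a class-aware structure for the old and the new classes, take the "Union" of the two structures, and then apply fine-grained pruning to preserve past knowledge. As a rough, non-authoritative illustration of that idea for a single convolutional layer, the minimal PyTorch sketch below uses hypothetical per-channel importance scores and arbitrary 50% keep/sparsity ratios; none of the function names, criteria, or numbers come from the thesis itself.

```python
import torch
import torch.nn as nn


def channel_mask(conv: nn.Conv2d, scores: torch.Tensor, keep_ratio: float) -> torch.Tensor:
    """Keep the output channels with the largest importance scores (illustrative criterion)."""
    k = max(1, int(keep_ratio * conv.out_channels))
    idx = torch.topk(scores, k).indices
    mask = torch.zeros(conv.out_channels, dtype=torch.bool)
    mask[idx] = True
    return mask


def union_and_prune(conv: nn.Conv2d,
                    old_scores: torch.Tensor,
                    new_scores: torch.Tensor,
                    keep_ratio: float = 0.5,
                    weight_sparsity: float = 0.5) -> None:
    # 1) Class-aware structure selection: one channel set for the old classes, one for the new.
    old_mask = channel_mask(conv, old_scores, keep_ratio)
    new_mask = channel_mask(conv, new_scores, keep_ratio)

    # 2) "Union": a channel is kept if either the old or the new classes need it,
    #    so the structure chosen for new data does not discard old-class channels.
    kept = old_mask | new_mask

    with torch.no_grad():
        # Remove channels that neither class group needs (structured part).
        conv.weight[~kept] = 0.0
        if conv.bias is not None:
            conv.bias[~kept] = 0.0

        # 3) Fine-grained (weight-level) pruning inside the kept channels:
        #    zero the smallest-magnitude weights up to the target sparsity.
        kept_w = conv.weight[kept]
        threshold = kept_w.abs().flatten().quantile(weight_sparsity)
        conv.weight[kept] = torch.where(kept_w.abs() >= threshold,
                                        kept_w, torch.zeros_like(kept_w))


if __name__ == "__main__":
    layer = nn.Conv2d(16, 32, kernel_size=3)
    old_scores = torch.rand(32)   # illustrative per-channel importance for old classes
    new_scores = torch.rand(32)   # illustrative per-channel importance for new classes
    union_and_prune(layer, old_scores, new_scores)
    print("non-zero weights:", int((layer.weight != 0).sum()))
```

In an actual IL iteration, such importance scores would presumably be derived from the retained old-class exemplars and the new-class data, and the pruned model would then be fine-tuned; those steps are omitted from this sketch.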
Appears in collections: | 資訊網路與多媒體研究所
Files in this item:
File | Size | Format | |
---|---|---|---|
ntu-112-1.pdf (currently not authorized for public access) | 2.34 MB | Adobe PDF |
All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.