Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/69838
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 劉邦鋒(Pangfeng Liu) | |
dc.contributor.author | Ching Ya Liao | en |
dc.contributor.author | 廖經亞 | zh_TW |
dc.date.accessioned | 2021-06-17T03:30:14Z | - |
dc.date.available | 2020-08-24 | |
dc.date.copyright | 2020-08-24 | |
dc.date.issued | 2020 | |
dc.date.submitted | 2020-08-19 | |
dc.identifier.citation | [1] C. Baykal, L. Liebenwein, I. Gilitschenski, D. Feldman, and D. Rus. Data-dependent coresets for compressing neural networks with applications to generalization bounds. arXiv preprint arXiv:1804.05345, 2018. [2] D. C. Ciresan, U. Meier, J. Masci, L. M. Gambardella, and J. Schmidhuber. Flexible, high performance convolutional neural networks for image classification. In Twenty-Second International Joint Conference on Artificial Intelligence, 2011. [3] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. ImageNet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition, pages 248–255. IEEE, 2009. [4] J. Elson, J. R. Douceur, J. Howell, and J. Saul. Asirra: a CAPTCHA that exploits interest-aligned manual image categorization. In ACM Conference on Computer and Communications Security, volume 7, pages 366–374, 2007. [5] J. Frankle and M. Carbin. The lottery ticket hypothesis: Finding sparse, trainable neural networks. arXiv preprint arXiv:1803.03635, 2018. [6] S. Han, X. Liu, H. Mao, J. Pu, A. Pedram, M. A. Horowitz, and W. J. Dally. EIE: Efficient inference engine on compressed deep neural network. ACM SIGARCH Computer Architecture News, 44(3):243–254, 2016. [7] S. Han, H. Mao, and W. J. Dally. Deep compression: Compressing deep neural networks with pruning, trained quantization and Huffman coding. arXiv preprint arXiv:1510.00149, 2015. [8] S. Han, J. Pool, J. Tran, and W. Dally. Learning both weights and connections for efficient neural network. In Advances in Neural Information Processing Systems, pages 1135–1143, 2015. [9] Y. He, G. Kang, X. Dong, Y. Fu, and Y. Yang. Soft filter pruning for accelerating deep convolutional neural networks. arXiv preprint arXiv:1808.06866, 2018. [10] H. Hu, R. Peng, Y.-W. Tai, and C.-K. Tang. Network trimming: A data-driven neuron pruning approach towards efficient deep architectures. arXiv preprint arXiv:1607.03250, 2016. [11] S. Ji, W. Xu, M. Yang, and K. Yu. 3D convolutional neural networks for human action recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(1):221–231, 2012. [12] H. Kagaya, K. Aizawa, and M. Ogawa. Food detection and recognition using convolutional neural network. In Proceedings of the 22nd ACM International Conference on Multimedia, pages 1085–1088, 2014. [13] A. Karpathy, G. Toderici, S. Shetty, T. Leung, R. Sukthankar, and L. Fei-Fei. Large-scale video classification with convolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1725–1732, 2014. [14] S. Lawrence, C. L. Giles, A. C. Tsoi, and A. D. Back. Face recognition: A convolutional neural-network approach. IEEE Transactions on Neural Networks, 8(1):98–113, 1997. [15] H. Li, A. Kadav, I. Durdanovic, H. Samet, and H. P. Graf. Pruning filters for efficient ConvNets. arXiv preprint arXiv:1608.08710, 2016. [16] Z. Li, L. Xu, and S. Zhu. Pruning of network filters for small dataset. IEEE Access, 8:4522–4533, 2019. [17] L. Liebenwein, C. Baykal, H. Lang, D. Feldman, and D. Rus. Provable filter pruning for efficient neural networks. arXiv preprint arXiv:1911.07412, 2019. [18] Z. Liu, M. Sun, T. Zhou, G. Huang, and T. Darrell. Rethinking the value of network pruning. arXiv preprint arXiv:1810.05270, 2018. [19] J.-H. Luo, J. Wu, and W. Lin. ThiNet: A filter level pruning method for deep neural network compression. In Proceedings of the IEEE International Conference on Computer Vision, pages 5058–5066, 2017. [20] C. J. Maddison, A. Huang, I. Sutskever, and D. Silver. Move evaluation in Go using deep convolutional neural networks. arXiv preprint arXiv:1412.6564, 2014. [21] E. Malach, G. Yehudai, S. Shalev-Shwartz, and O. Shamir. Proving the lottery ticket hypothesis: Pruning is all you need. arXiv preprint arXiv:2002.00585, 2020. [22] P. Molchanov, S. Tyree, T. Karras, T. Aila, and J. Kautz. Pruning convolutional neural networks for resource efficient inference. arXiv preprint arXiv:1611.06440, 2016. [23] A. Morcos, H. Yu, M. Paganini, and Y. Tian. One ticket to win them all: generalizing lottery ticket initializations across datasets and optimizers. In Advances in Neural Information Processing Systems, pages 4932–4942, 2019. [24] M.-E. Nilsback and A. Zisserman. Automated flower classification over a large number of classes. In 2008 Sixth Indian Conference on Computer Vision, Graphics & Image Processing, pages 722–729. IEEE, 2008. [25] S. J. Pan and Q. Yang. A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering, 22(10):1345–1359, 2009. [26] A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, et al. PyTorch: An imperative style, high-performance deep learning library. In Advances in Neural Information Processing Systems, pages 8026–8037, 2019. [27] D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. Van Den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, et al. Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587):484–489, 2016. [28] K. Simonyan and A. Zisserman. Two-stream convolutional networks for action recognition in videos. In Advances in Neural Information Processing Systems, pages 568–576, 2014. [29] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014. [30] H. Touvron, A. Vedaldi, M. Douze, and H. Jégou. Fixing the train-test resolution discrepancy: FixEfficientNet. arXiv preprint arXiv:2003.08237, 2020. | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/69838 | - |
dc.description.abstract | 在這篇論文中，我們提出了一個方法，可以從一個已經預訓練過的大模型，萃取出一個能分辨特定範圍之資料集的小模型。這個工作主要是結合模型壓縮及遷移學習技術達成的。在壓縮的步驟中，目的是選擇模型中對於我們的目標資料集敏感的部分，然後將這些選擇出來的部分進行遷移學習，以產生一個較小並且客製化的模型。我們考慮影像辨識方面的應用，使用卷積神經網路模型結構。我們觀察到不同類別的圖片會激發不同的濾波器，因此我們可以透過這個特性去選擇模型裡敏感的部分。我們使用VGG-16 (使用ImageNet資料集預訓練)作為大模型，並且在兩個不同的資料集上 (Flowers-102和Cats vs. Dogs) 進行遷移學習。我們應用三種不同的修剪規則到我們的方法上。其中表現最好的規則在Flowers-102資料集上，可以在壓縮比率達到0.4時損失不到2%的準確率。我們也修改了一些我們提出的方法的設定，以檢驗它們的可行性。我們提出的方法可以找出適合目標資料集的模型結構和參數，使得遷移學習能夠更有效率。 | zh_TW
dc.description.abstract | In this paper, we propose a scheme to obtain a reduced model for a domain-specific dataset from a pretrained big model. This is achieved by combining model compression and transfer learning techniques. The compression step selects the parts of the big model that are sensitive to our target dataset, and we then perform transfer learning on the selected parts to construct a reduced, customized model. We consider image classification applications and use convolutional neural network architectures. We observe that different image categories activate different filters, so we can use this property to select the sensitive parts of the model. We take VGG-16 (pretrained on the ImageNet dataset) as the big model and perform transfer learning on two datasets (Flowers-102 and Cats vs. Dogs). We apply three pruning criteria to our scheme. With the best criterion, accuracy drops by less than 2% at a pruning ratio of 0.4 on the Flowers-102 dataset. We also vary several settings of our scheme and examine their feasibility. Our scheme finds a pruned structure and parameters suited to the target dataset, which makes transfer learning more efficient. (An illustrative sketch of the filter-selection step is given below the metadata record.) | en
dc.description.provenance | Made available in DSpace on 2021-06-17T03:30:14Z (GMT). No. of bitstreams: 1 U0001-1808202000483300.pdf: 855577 bytes, checksum: 422cddcea742d1cc934324fa4e231aa8 (MD5) Previous issue date: 2020 | en |
dc.description.tableofcontents | Verification Letter from the Oral Examination Committee i; Acknowledgements ii; 摘要 iii; Abstract iv; Contents vi; List of Figures viii; List of Tables ix; Chapter 1 Introduction 1; Chapter 2 Related Works 4; 2.1 Connection Pruning 4; 2.2 Filter Pruning 5; 2.3 Initialization 6; 2.4 Relation to our Work 6; Chapter 3 Background 8; 3.1 Convolution Layers 8; 3.2 Filter Pruning 9; 3.3 Pruning Criteria 9; 3.3.1 Minimum Weight 10; 3.3.2 Mean Activation 10; 3.3.3 Average Percentage of Zeros (APoZ) 11; Chapter 4 Scheme 12; 4.1 Compute Importance of Filters 12; 4.2 Sort Filters by Importance 13; 4.3 Pruning Criteria 13; 4.3.1 Average Weight 14; 4.3.2 Mean Activation 14; 4.3.3 Average Percentage of Zeros 14; Chapter 5 Evaluation 16; 5.1 Experiment Settings 16; 5.2 Global Pruning 17; 5.3 Pruning by Layer 19; 5.4 Retraining with Random Initialization 21; Chapter 6 Conclusion 23; References 25 | |
dc.language.iso | en | |
dc.title | 卷積濾波器修剪以用於小數據集上的遷移學習 | zh_TW |
dc.title | Convolution Filter Pruning for Transfer Learning on Small Dataset | en |
dc.type | Thesis | |
dc.date.schoolyear | 108-2 | |
dc.description.degree | 碩士 | |
dc.contributor.oralexamcommittee | 吳真貞(Jan-Jan Wu),林敬棋(Ching-Chi Lin) | |
dc.subject.keyword | 模型壓縮, 濾波器修剪, 遷移學習 | zh_TW
dc.subject.keyword | Model Compression, Filter Pruning, Transfer Learning | en
dc.relation.page | 28 | |
dc.identifier.doi | 10.6342/NTU202003895 | |
dc.rights.note | 有償授權 | |
dc.date.accepted | 2020-08-20 | |
dc.contributor.author-college | 電機資訊學院 | zh_TW |
dc.contributor.author-dept | 資訊工程學研究所 | zh_TW |
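The abstract above describes the filter-selection step only at a high level. The following is a minimal, illustrative sketch (not the thesis code) of one of the three pruning criteria it mentions, Average Percentage of Zeros (APoZ, from Hu et al. [10]), applied to a VGG-16 pretrained on ImageNet. The random stand-in data loader, the hook naming, and the keep ratio of 0.6 (i.e. a pruning ratio of 0.4, the best reported setting) are assumptions made for this example.

```python
import torch
import torchvision

# Pretrained VGG-16 as the "big model"; evaluation mode for importance measurement.
model = torchvision.models.vgg16(pretrained=True).eval()

# Stand-in for a target-dataset loader (e.g. Flowers-102); random tensors here.
target_loader = [(torch.randn(8, 3, 224, 224), None) for _ in range(4)]

# Accumulate, per ReLU layer, how often each output channel is zero (APoZ).
apoz_sum, batch_count = {}, {}

def make_hook(name):
    def hook(module, inputs, output):
        # output shape: (batch, channels, H, W); fraction of zero activations per channel
        zeros = (output == 0).float().mean(dim=(0, 2, 3))
        apoz_sum[name] = apoz_sum.get(name, 0) + zeros
        batch_count[name] = batch_count.get(name, 0) + 1
    return hook

hooks = [layer.register_forward_hook(make_hook(f"features.{i}"))
         for i, layer in enumerate(model.features)
         if isinstance(layer, torch.nn.ReLU)]

# Run the target data through the pretrained network to collect activations.
with torch.no_grad():
    for images, _ in target_loader:
        model(images)

for h in hooks:
    h.remove()

# A low APoZ means the filter is frequently activated by the target data,
# so it is treated as "sensitive" and kept; the remaining filters would be pruned.
keep_ratio = 0.6  # assumed keep ratio (pruning ratio 0.4)
for name, total in apoz_sum.items():
    apoz = total / batch_count[name]
    keep = torch.argsort(apoz)[: int(keep_ratio * apoz.numel())]
    print(f"{name}: keep {keep.numel()} of {apoz.numel()} filters")
```

In the thesis's scheme, the filters selected in this way define a smaller network that is then fine-tuned on the target dataset (the transfer-learning step); the sketch stops at the ranking stage and does not reproduce the mean-activation or average-weight criteria.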
Appears in Collections: | 資訊工程學系
Files in This Item:
File | Size | Format | |
---|---|---|---|
U0001-1808202000483300.pdf (currently not authorized for public access) | 835.52 kB | Adobe PDF |
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.