Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/69838
Full metadata record (each entry: DC field: value [language])
dc.contributor.advisor: 劉邦鋒 (Pangfeng Liu)
dc.contributor.author: Ching Ya Liao [en]
dc.contributor.author: 廖經亞 [zh_TW]
dc.date.accessioned: 2021-06-17T03:30:14Z
dc.date.available: 2020-08-24
dc.date.copyright: 2020-08-24
dc.date.issued: 2020
dc.date.submitted: 2020-08-19
dc.identifier.citation[1] C. Baykal, L. Liebenwein, I. Gilitschenski, D. Feldman, and D. Rus. Data-dependent coresets for compressing neural networks with applications to generalization bounds. arXiv preprint arXiv:1804.05345, 2018.
[2] D. C. Ciresan, U. Meier, J. Masci, L. M. Gambardella, and J. Schmidhuber. Flexible, high performance convolutional neural networks for image classification. In Twenty-Second International Joint Conference on Artificial Intelligence, 2011.
[3] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. Imagenet: A large-scale hierarchical image database. In 2009 IEEE conference on computer vision and pattern recognition, pages 248–255. Ieee, 2009.
[4] J. Elson, J. R. Douceur, J. Howell, and J. Saul. Asirra: a captcha that exploits interest-aligned manual image categorization. In ACM Conference on Computer and Communications Security, volume 7, pages 366–374, 2007.
[5] J. Frankle and M. Carbin. The lottery ticket hypothesis: Finding sparse, trainable neural networks. arXiv preprint arXiv:1803.03635, 2018.
[6] S. Han, X. Liu, H. Mao, J. Pu, A. Pedram, M. A. Horowitz, and W. J. Dally. Eie: efficient inference engine on compressed deep neural network. ACM SIGARCH Computer Architecture News, 44(3):243–254, 2016.
[7] S. Han, H. Mao, and W. J. Dally. Deep compression: Compressing deep neural networks with pruning, trained quantization and huffman coding. arXiv preprint arXiv:1510.00149, 2015.
[8] S. Han, J. Pool, J. Tran, and W. Dally. Learning both weights and connections for efficient neural network. In Advances in neural information processing systems, pages 1135–1143, 2015.
[9] Y. He, G. Kang, X. Dong, Y. Fu, and Y. Yang. Soft filter pruning for accelerating deep convolutional neural networks. arXiv preprint arXiv:1808.06866, 2018.
[10] H. Hu, R. Peng, Y.-W. Tai, and C.-K. Tang. Network trimming: A data-driven neuron pruning approach towards efficient deep architectures. arXiv preprint arXiv:1607.03250, 2016.
[11] S. Ji, W. Xu, M. Yang, and K. Yu. 3d convolutional neural networks for human action recognition. IEEE transactions on pattern analysis and machine intelligence, 35(1): 221–231, 2012.
[12] H. Kagaya, K. Aizawa, and M. Ogawa. Food detection and recognition using convolutional neural network. In Proceedings of the 22nd ACM international conference on Multimedia, pages 1085–1088, 2014.
[13] A. Karpathy, G. Toderici, S. Shetty, T. Leung, R. Sukthankar, and L. Fei-Fei. Large-scale video classification with convolutional neural networks. In Proceedings of the IEEE conference on Computer Vision and Pattern Recognition, pages 1725–1732, 2014.
[14] S. Lawrence, C. L. Giles, A. C. Tsoi, and A. D. Back. Face recognition: A convolutional neural-network approach. IEEE transactions on neural networks, 8(1):98–113, 1997.
[15] H. Li, A. Kadav, I. Durdanovic, H. Samet, and H. P. Graf. Pruning filters for efficient convnets. arXiv preprint arXiv:1608.08710, 2016.
[16] Z. Li, L. Xu, and S. Zhu. Pruning of network filters for small dataset. IEEE Access, 8:4522–4533, 2019.
[17] L. Liebenwein, C. Baykal, H. Lang, D. Feldman, and D. Rus. Provable filter pruning for efficient neural networks. arXiv preprint arXiv:1911.07412, 2019.
[18] Z. Liu, M. Sun, T. Zhou, G. Huang, and T. Darrell. Rethinking the value of network pruning. arXiv preprint arXiv:1810.05270, 2018.
[19] J.-H. Luo, J. Wu, and W. Lin. Thinet: A filter level pruning method for deep neural network compression. In Proceedings of the IEEE international conference on computer vision, pages 5058–5066, 2017.
[20] C. J. Maddison, A. Huang, I. Sutskever, and D. Silver. Move evaluation in go using deep convolutional neural networks. arXiv preprint arXiv:1412.6564, 2014.
[21] E. Malach, G. Yehudai, S. Shalev-Shwartz, and O. Shamir. Proving the lottery ticket hypothesis: Pruning is all you need. arXiv preprint arXiv:2002.00585, 2020.
[22] P. Molchanov, S. Tyree, T. Karras, T. Aila, and J. Kautz. Pruning convolutional neural networks for resource efficient inference. arXiv preprint arXiv:1611.06440, 2016.
[23] A. Morcos, H. Yu, M. Paganini, and Y. Tian. One ticket to win them all: generalizing lottery ticket initializations across datasets and optimizers. In Advances in Neural Information Processing Systems, pages 4932–4942, 2019.
[24] M.-E. Nilsback and A. Zisserman. Automated flower classification over a large number of classes. In 2008 Sixth Indian Conference on Computer Vision, Graphics Image Processing, pages 722–729. IEEE, 2008.
[25] S. J. Pan and Q. Yang. A survey on transfer learning. IEEE Transactions on knowledge and data engineering, 22(10):1345–1359, 2009.
[26] A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, et al. Pytorch: An imperative style, high-performance deep learning library. In Advances in neural information processing systems, pages 8026–8037, 2019.
[27] D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. Van Den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, et al. Mastering the game of go with deep neural networks and tree search. nature, 529(7587):484–489, 2016.
[28] K. Simonyan and A. Zisserman. Two-stream convolutional networks for action recognition in videos. In Advances in neural information processing systems, pages 568–576, 2014.
[29] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
[30] H. Touvron, A. Vedaldi, M. Douze, and H. Jégou. Fixing the train-test resolution discrepancy: Fixefficientnet. arXiv preprint arXiv:2003.08237, 2020.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/69838
dc.description.abstract: In this thesis, we propose a method to extract, from a pretrained large model, a small model that can classify a domain-specific dataset. This is achieved mainly with model compression and transfer learning techniques. The compression step aims to select the parts of the model that are sensitive to our target dataset; transfer learning is then applied to the selected parts to produce a smaller, customized model. We consider image classification applications and use convolutional neural network architectures. We observe that images of different categories activate different filters, so we can use this property to select the sensitive parts of the model.
We use VGG-16 (pretrained on the ImageNet dataset) as the large model and perform transfer learning on two different datasets (Flowers-102 and Cats vs. Dogs). We apply three different pruning criteria to our method. The best-performing criterion loses less than 2% accuracy on the Flowers-102 dataset at a compression ratio of 0.4. We also modify some settings of the proposed method to examine their feasibility. The proposed method finds a model structure and parameters suited to the target dataset, which makes transfer learning more efficient.
[zh_TW]
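The abstract above hinges on measuring which filters a target dataset activates. As an illustrative sketch only (not the thesis code), the following PyTorch snippet scores each convolution filter of a pretrained VGG-16 on a target dataset by mean activation and by the average percentage of zeros (APoZ), two of the criteria named in the thesis's table of contents; target_loader is a hypothetical DataLoader over the target dataset and is not defined here.

# Illustrative sketch only (not the thesis code): score the convolution filters of a
# pretrained VGG-16 on a target dataset by mean activation and by the average
# percentage of zeros (APoZ).
import torch
import torchvision

# Newer torchvision versions use the `weights=` argument instead of `pretrained=True`.
model = torchvision.models.vgg16(pretrained=True).eval()

stats = {}  # per-conv-layer running sums of the two importance measures

def make_hook(name):
    def hook(module, inputs, output):
        act = torch.relu(output.detach())              # post-ReLU feature maps, shape (N, C, H, W)
        mean_act = act.mean(dim=(0, 2, 3))             # mean activation per output filter
        apoz = (act == 0).float().mean(dim=(0, 2, 3))  # fraction of zero activations per filter
        s = stats.setdefault(name, {"mean": 0.0, "apoz": 0.0, "batches": 0})
        s["mean"] = s["mean"] + mean_act
        s["apoz"] = s["apoz"] + apoz
        s["batches"] += 1
    return hook

for name, module in model.features.named_modules():
    if isinstance(module, torch.nn.Conv2d):
        module.register_forward_hook(make_hook(name))

with torch.no_grad():
    for images, _ in target_loader:   # hypothetical DataLoader over the target dataset
        model(images)

# Average the per-batch statistics; a larger mean activation (or a smaller APoZ)
# suggests a filter that the target dataset activates more strongly.
importance = {name: {"mean_activation": s["mean"] / s["batches"],
                     "apoz": s["apoz"] / s["batches"]}
              for name, s in stats.items()}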
dc.description.abstract: In this thesis, we propose a scheme to obtain a reduced model for a domain-specific dataset from a pretrained big model. We achieve this by combining model compression and transfer learning. The compression step selects the parts of the big model that are sensitive to our target dataset, and we then apply transfer learning to the selected parts to construct a reduced, customized model. We consider image classification and use convolutional neural network structures. We observe that different image categories activate different filters, so we can select the sensitive parts of the model through this property.
We take VGG-16 (pretrained on the ImageNet dataset) as the big model and perform transfer learning on two datasets (Flowers-102 and Cats vs. Dogs). We apply three pruning criteria to our scheme. With the best criterion, accuracy drops by less than 2% when the pruning ratio is 0.4 on the Flowers-102 dataset. We also change some settings of our scheme and examine their feasibility. Our scheme finds a pruned structure and parameters suited to the target dataset, which makes transfer learning more efficient.
[en]
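To illustrate how such scores could yield a reduced model, the sketch below keeps the highest-scoring filters of a single Conv2d layer at a given pruning ratio and copies their weights into a smaller layer, which could then be fine-tuned on the target dataset. This is a hedged sketch, not the thesis implementation: prune_conv_filters is a hypothetical helper, pruning_ratio is taken here to mean the fraction of filters removed, and the next layer's input channels would still need to be sliced to match the kept filters.

# Illustrative sketch only (not the thesis implementation): keep the highest-scoring
# filters of one Conv2d layer and copy their weights into a smaller layer.
import torch

def prune_conv_filters(conv: torch.nn.Conv2d, scores: torch.Tensor, pruning_ratio: float):
    """Return (smaller Conv2d, indices of kept filters); assumes groups == 1.

    `scores` has one entry per output filter (e.g. mean activation, or 1 - APoZ);
    `pruning_ratio` is interpreted here as the fraction of filters to remove.
    """
    n_keep = max(1, int(round(conv.out_channels * (1.0 - pruning_ratio))))
    keep = torch.argsort(scores, descending=True)[:n_keep]

    pruned = torch.nn.Conv2d(conv.in_channels, n_keep,
                             kernel_size=conv.kernel_size,
                             stride=conv.stride,
                             padding=conv.padding,
                             bias=conv.bias is not None)
    with torch.no_grad():
        pruned.weight.copy_(conv.weight[keep])       # keep only the selected output filters
        if conv.bias is not None:
            pruned.bias.copy_(conv.bias[keep])
    return pruned, keep

# Hypothetical usage with scores from the previous sketch and the 0.4 ratio
# mentioned in the abstract; the following layer's input channels (and any
# normalization layers) must also be sliced with `keep` before fine-tuning.
# new_layer, keep = prune_conv_filters(model.features[0], scores, pruning_ratio=0.4)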
dc.description.provenance: Made available in DSpace on 2021-06-17T03:30:14Z (GMT). No. of bitstreams: 1
U0001-1808202000483300.pdf: 855577 bytes, checksum: 422cddcea742d1cc934324fa4e231aa8 (MD5)
Previous issue date: 2020
[en]
dc.description.tableofcontents:
Verification Letter from the Oral Examination Committee i
Acknowledgements ii
摘要 (Chinese abstract) iii
Abstract iv
Contents vi
List of Figures viii
List of Tables ix
Chapter 1 Introduction 1
Chapter 2 Related Works 4
2.1 Connection Pruning 4
2.2 Filter Pruning 5
2.3 Initialization 6
2.4 Relation to our Work 6
Chapter 3 Background 8
3.1 Convolution Layers 8
3.2 Filter Pruning 9
3.3 Pruning Criteria 9
3.3.1 Minimum Weight 10
3.3.2 Mean Activation 10
3.3.3 Average Percentage of Zeros (APoZ) 11
Chapter 4 Scheme 12
4.1 Compute Importance of Filters 12
4.2 Sort Filters by Importance 13
4.3 Pruning Criteria 13
4.3.1 Average Weight 14
4.3.2 Mean Activation 14
4.3.3 Average Percentage of Zeros 14
Chapter 5 Evaluation 16
5.1 Experiment Settings 16
5.2 Global Pruning 17
5.3 Pruning by Layer 19
5.4 Retraining with Random Initialization 21
Chapter 6 Conclusion 23
References 25
dc.language.iso: en
dc.title: 卷積濾波器修剪以用於小數據集上的遷移學習 [zh_TW]
dc.title: Convolution Filter Pruning for Transfer Learning on Small Dataset [en]
dc.type: Thesis
dc.date.schoolyear: 108-2
dc.description.degree: 碩士 (Master's)
dc.contributor.oralexamcommittee: 吳真貞 (Jan-Jan Wu), 林敬棋 (Ching-Chi Lin)
dc.subject.keyword: 模型壓縮 (model compression), 濾波器修剪 (filter pruning), 遷移學習 (transfer learning) [zh_TW]
dc.subject.keyword: Model Compression, Filter Pruning, Transfer Learning [en]
dc.relation.page: 28
dc.identifier.doi: 10.6342/NTU202003895
dc.rights.note: 有償授權 (paid authorization)
dc.date.accepted: 2020-08-20
dc.contributor.author-college: 電機資訊學院 (College of Electrical Engineering and Computer Science) [zh_TW]
dc.contributor.author-dept: 資訊工程學研究所 (Graduate Institute of Computer Science and Information Engineering) [zh_TW]
Appears in collections: 資訊工程學系 (Department of Computer Science and Information Engineering)

Files in this item:
File: U0001-1808202000483300.pdf | Size: 835.52 kB | Format: Adobe PDF | Access: currently not authorized for public access