NTU Theses and Dissertations Repository
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/59139

Full metadata record
DC field: value (language)
dc.contributor.advisor: 劉邦鋒 (Pang-Feng Liu)
dc.contributor.author: Tse-Wen Chen (en)
dc.contributor.author: 陳則彣 (zh_TW)
dc.date.accessioned: 2021-06-16T09:16:37Z
dc.date.available: 2020-08-24
dc.date.copyright: 2020-08-24
dc.date.issued: 2020
dc.date.submitted: 2020-08-18
dc.identifier.citation:
1. Z. Cai and N. Vasconcelos. Cascade R-CNN: Delving into high quality object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 6154–6162, 2018.
2. A. Chaddad, B. Naisiri, M. Pedersoli, E. Granger, C. Desrosiers, and M. Toews. Modeling information flow through deep neural networks. arXiv preprint arXiv:1712.00003, 2017.
3. Y. Gong, L. Liu, M. Yang, and L. Bourdev. Compressing deep convolutional networks using vector quantization. arXiv preprint arXiv:1412.6115, 2014.
4. S. Han, J. Pool, J. Tran, and W. Dally. Learning both weights and connections for efficient neural network. In Advances in Neural Information Processing Systems, pages 1135–1143, 2015.
5. S. H. Hasanpour, M. Rouhani, M. Fayyaz, and M. Sabokrou. Lets keep it simple, using simple architectures to outperform deeper and more complex architectures. arXiv preprint arXiv:1608.06037, 2016.
6. H. Hu, R. Peng, Y.-W. Tai, and C.-K. Tang. Network trimming: A data-driven neuron pruning approach towards efficient deep architectures. arXiv preprint arXiv:1607.03250, 2016.
7. M. Jaderberg, A. Vedaldi, and A. Zisserman. Speeding up convolutional neural networks with low rank expansions. arXiv preprint arXiv:1405.3866, 2014.
8. B. Kayalibay, G. Jensen, and P. van der Smagt. CNN-based segmentation of medical imaging data. arXiv preprint arXiv:1701.03056, 2017.
9. H. Li, A. Kadav, I. Durdanovic, H. Samet, and H. P. Graf. Pruning filters for efficient ConvNets. arXiv preprint arXiv:1608.08710, 2016.
10. Y. Li, S. Lin, B. Zhang, J. Liu, D. Doermann, Y. Wu, F. Huang, and R. Ji. Exploiting kernel sparsity and entropy for interpretable CNN compression. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2800–2809, 2019.
11. S. Lin, R. Ji, C. Chen, D. Tao, and J. Luo. Holistic CNN compression via low-rank decomposition with knowledge transfer. IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(12):2889–2905, 2018.
12. J.-H. Luo and J. Wu. An entropy-based pruning method for CNN compression. arXiv preprint arXiv:1706.05791, 2017.
13. F. Milletari, N. Navab, and S.-A. Ahmadi. V-Net: Fully convolutional neural networks for volumetric medical image segmentation. In 2016 Fourth International Conference on 3D Vision (3DV), pages 565–571. IEEE, 2016.
14. A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, et al. PyTorch: An imperative style, high-performance deep learning library. In Advances in Neural Information Processing Systems, pages 8026–8037, 2019.
15. S. Ren, K. He, R. Girshick, and J. Sun. Faster R-CNN: Towards real-time object detection with region proposal networks. In Advances in Neural Information Processing Systems, pages 91–99, 2015.
16. A. Salama, O. Ostapenko, T. Klein, and M. Nabi. Pruning at a glance: Global neural pruning for model compression. arXiv preprint arXiv:1912.00200, 2019.
17. R. Shwartz-Ziv and N. Tishby. Opening the black box of deep neural networks via information. arXiv preprint arXiv:1703.00810, 2017.
18. K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
19. C. Szegedy, S. Ioffe, V. Vanhoucke, and A. A. Alemi. Inception-v4, Inception-ResNet and the impact of residual connections on learning. In Thirty-First AAAI Conference on Artificial Intelligence, 2017.
20. M. Tan and Q. V. Le. EfficientNet: Rethinking model scaling for convolutional neural networks. arXiv preprint arXiv:1905.11946, 2019.
21. N. Tishby and N. Zaslavsky. Deep learning and the information bottleneck principle. In 2015 IEEE Information Theory Workshop (ITW), pages 1–5. IEEE, 2015.
22. H. Touvron, A. Vedaldi, M. Douze, and H. Jégou. Fixing the train-test resolution discrepancy: FixEfficientNet. arXiv preprint arXiv:2003.08237, 2020.
23. Wikipedia. Mutual information. Wikipedia, The Free Encyclopedia, 2020. Accessed 5 August 2020.
24. T.-J. Yang, Y.-H. Chen, and V. Sze. Designing energy-efficient convolutional neural networks using energy-aware pruning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 5687–5695, 2017.
25. A. Zhou, A. Yao, Y. Guo, L. Xu, and Y. Chen. Incremental network quantization: Towards lossless CNNs with low-precision weights. arXiv preprint arXiv:1702.03044, 2017.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/59139
dc.description.abstract: Convolutional neural networks have achieved great success across many computer vision tasks. However, because of hardware and software resource constraints, network compression is a technique for making CNNs smaller and for speeding up both training and inference. We focus on channel pruning, a form of network compression that first evaluates the importance of each channel in a convolution layer and then prunes away the less important channels.
In this thesis, we propose the weighted mutual information method, which outperforms the traditional L1-norm method and other entropy-based methods. We first compute the mutual information between the feature maps and the labels, and use it to remove the part of the entropy that is irrelevant to classification. We then consider the effect of passing a continuous random variable through a set of filter weights and compute the total entropy of the output.
We perform channel pruning with the Simplenet model on three datasets: SVHN, CIFAR-10, and CIFAR-100. When the parameter percentage is set to 0.3 for all convolution layers, our weighted mutual information method achieves 1.52%, 13.24%, and 7.90% higher accuracy than the output L1 method on SVHN, CIFAR-10, and CIFAR-100, respectively.
In the global pruning experiments, our weighted mutual information method achieves about 2% higher accuracy than the output L1 method on SVHN at a parameter percentage of 0.45, and about 1.5% higher accuracy on CIFAR-100 at a parameter percentage of 0.53. The only exception is CIFAR-10, where at a parameter percentage of 0.40 our method is about 5% less accurate than the output L1 method; under the same setting, however, the entropy method estimated with a Gaussian distribution unexpectedly outperforms the output L1 method by about 51% in accuracy.
zh_TW
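The abstract compares an L1-norm metric against entropy-based metrics for ranking the channels of a convolution layer before pruning. The following is an illustrative sketch only, not the thesis implementation: function names, tensor shapes, and the histogram-based entropy estimate are all assumptions made for the example.

```python
# Sketch of two channel-importance metrics discussed in the abstract,
# using NumPy only. Shapes and names are illustrative assumptions.
import numpy as np

def l1_scores(weights):
    """L1-norm metric: score each output channel by the L1 norm of its
    filter. `weights` has shape (out_channels, in_channels, k, k)."""
    return np.abs(weights).reshape(weights.shape[0], -1).sum(axis=1)

def entropy_scores(feature_maps, bins=16):
    """Entropy metric: score each channel by the Shannon entropy of a
    histogram of its activations. `feature_maps` has shape
    (batch, channels, h, w)."""
    scores = []
    for c in range(feature_maps.shape[1]):
        hist, _ = np.histogram(feature_maps[:, c], bins=bins)
        p = hist / hist.sum()
        p = p[p > 0]                       # drop empty bins (0 log 0 = 0)
        scores.append(-(p * np.log2(p)).sum())
    return np.array(scores)

def prune_mask(scores, keep_ratio):
    """Keep the top `keep_ratio` fraction of channels by score."""
    k = max(1, int(round(keep_ratio * len(scores))))
    keep = np.argsort(scores)[::-1][:k]    # indices of highest scores
    mask = np.zeros(len(scores), dtype=bool)
    mask[keep] = True
    return mask

rng = np.random.default_rng(0)
w = rng.normal(size=(8, 3, 3, 3))          # 8 filters of a toy conv layer
fmap = rng.normal(size=(32, 8, 14, 14))    # toy activations, 32 samples
mask = prune_mask(l1_scores(w), keep_ratio=0.5)
print(mask.sum())  # 4 channels kept out of 8
```

In a real pipeline the mask would be used to drop the corresponding filters (and the matching input channels of the next layer) before fine-tuning.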
dc.description.abstract: Convolutional neural networks (CNNs) achieve great success, especially in computer vision tasks. However, due to the limitations of hardware and software resources, model compression is an important technique for making CNNs smaller and faster to train and run inference on. We focus on channel pruning, a form of model compression that evaluates the importance of each channel in a convolution layer and prunes away the less important channels.
In this thesis, we propose the weighted mutual information metric, which outperforms the L1-norm pruning metric and other entropy metrics. We first compute the mutual information between feature maps and labels in order to remove information that is not relevant to the classification task. We then consider the effect of passing a continuous random variable through the filter weights and estimate the output entropy.
We perform channel pruning on three datasets, SVHN, CIFAR-10, and CIFAR-100, using the model Simplenet. When the parameter percentage is 0.3 for all convolution layers, our weighted mutual information method achieves 1.52%, 13.24%, and 7.90% higher accuracy than the output L1 metric on SVHN, CIFAR-10, and CIFAR-100, respectively.
In the global pruning experiment, our weighted mutual information metric achieves about 2% higher accuracy than the output L1 metric at a parameter ratio of about 0.45 on SVHN. On CIFAR-100, our metric achieves about 1.5% higher accuracy than the output L1 metric at a parameter ratio of about 0.53. The only exception is CIFAR-10, where our metric performs about 5% worse than the output L1 metric at a parameter ratio of about 0.40; under the same conditions, the entropy metric estimated from a Gaussian distribution unexpectedly outperforms the output L1 metric by about 51%.
en
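The abstract's key quantity is the mutual information between a feature map and the labels. A minimal sketch of how such a quantity can be estimated from samples is shown below; this is an assumed, histogram-based estimator for illustration, not the weighted variant proposed in the thesis.

```python
# Hedged sketch: estimating I(F; Y) in bits between a quantized
# per-sample feature-map statistic F and integer class labels Y.
import numpy as np

def mutual_information(f, y, bins=8):
    """Plug-in estimate of I(F;Y) from samples. `f` is a 1-D array of
    per-sample channel statistics (e.g. mean activation); `y` holds
    integer labels 0..C-1."""
    # Quantize the continuous statistic into `bins` buckets.
    edges = np.histogram_bin_edges(f, bins=bins)[1:-1]
    f_bin = np.digitize(f, edges)                  # values in 0..bins-1
    # Empirical joint distribution over (feature bin, label).
    joint = np.zeros((bins, y.max() + 1))
    for fb, yb in zip(f_bin, y):
        joint[fb, yb] += 1
    p = joint / joint.sum()
    pf = p.sum(axis=1, keepdims=True)              # marginal of F
    py = p.sum(axis=0, keepdims=True)              # marginal of Y
    nz = p > 0                                     # avoid log(0)
    return float((p[nz] * np.log2(p[nz] / (pf @ py)[nz])).sum())

rng = np.random.default_rng(1)
y = rng.integers(0, 2, size=500)
f_informative = y + 0.3 * rng.normal(size=500)     # correlates with label
f_noise = rng.normal(size=500)                     # independent of label
print(mutual_information(f_informative, y), mutual_information(f_noise, y))
```

A channel whose statistic carries information about the labels receives a high score, while a label-independent channel scores near zero, which is the intuition behind pruning by mutual information.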
dc.description.provenance: Made available in DSpace on 2021-06-16T09:16:37Z (GMT). No. of bitstreams: 1
U0001-1408202015505700.pdf: 1063076 bytes, checksum: 96cbe7ff0a591f93fb616ba4dd1434e7 (MD5)
Previous issue date: 2020
en
dc.description.tableofcontents:
摘要 1
Abstract 2
Contents 4
List of Figures 6
Chapter 1 Introduction 1
Chapter 2 Related works 4
  2.1 Information theory on neural network 4
  2.2 Unstructured pruning 4
  2.3 Structured pruning 5
Chapter 3 Background 6
  3.1 Information Theory 6
  3.2 Convolution Layer 7
  3.3 Channel Pruning 7
Chapter 4 Methodology 9
  4.1 L1-norm of filter weight 9
  4.2 Entropy of feature map 10
  4.3 Mutual information between feature map and label 10
  4.4 Weighted Mutual Information 11
Chapter 5 Evaluation 14
  5.1 Experiment Settings 14
  5.2 Local Pruning 14
  5.3 Global Pruning 19
Chapter 6 Conclusion 26
References 28
dc.language.iso: zh-TW
dc.subject: 機器學習 (zh_TW)
dc.subject: 網路壓縮 (zh_TW)
dc.subject: 濾波器剪枝 (zh_TW)
dc.subject: 熵 (zh_TW)
dc.subject: 捲積神經網路 (zh_TW)
dc.subject: entropy (en)
dc.subject: CNN (en)
dc.subject: model compression (en)
dc.subject: filter pruning (en)
dc.subject: machine learning (en)
dc.title: 利用資料熵進行神經網路之壓縮 (zh_TW)
dc.title: Exploiting Data Entropy for Neural Network Compression (en)
dc.type: Thesis
dc.date.schoolyear: 108-2
dc.description.degree: 碩士 (Master)
dc.contributor.oralexamcommittee: 吳真貞 (Jan-Jan Wu), 許俊琛 (Jun-Chen Xu)
dc.subject.keyword: 網路壓縮, 濾波器剪枝, 熵, 捲積神經網路, 機器學習 (zh_TW)
dc.subject.keyword: model compression, filter pruning, entropy, CNN, machine learning (en)
dc.relation.page: 30
dc.identifier.doi: 10.6342/NTU202003440
dc.rights.note: 有償授權 (paid authorization)
dc.date.accepted: 2020-08-19
dc.contributor.author-college: 電機資訊學院 (College of Electrical Engineering and Computer Science) (zh_TW)
dc.contributor.author-dept: 資訊工程學研究所 (Graduate Institute of Computer Science and Information Engineering) (zh_TW)
Appears in collections: Department of Computer Science and Information Engineering

Files in this item:
U0001-1408202015505700.pdf, 1.04 MB, Adobe PDF, not authorized for public access

