Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/93313

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 劉邦鋒 | zh_TW |
| dc.contributor.advisor | Pangfeng Liu | en |
| dc.contributor.author | 杜秉翰 | zh_TW |
| dc.contributor.author | Ping-Han Tu | en |
| dc.date.accessioned | 2024-07-29T16:12:04Z | - |
| dc.date.available | 2024-07-30 | - |
| dc.date.copyright | 2024-07-29 | - |
| dc.date.issued | 2024 | - |
| dc.date.submitted | 2024-06-03 | - |
| dc.identifier.citation | [1] L. Bossard, M. Guillaumin, and L. Van Gool. Food-101 – mining discriminative components with random forests. In D. Fleet, T. Pajdla, B. Schiele, and T. Tuytelaars, editors, Computer Vision – ECCV 2014, pages 446–461, Cham, 2014. Springer International Publishing.
[2] Y. Cai, Y. Zhou, Q. Han, J. Sun, X. Kong, J. Li, and X. Zhang. Reversible column networks. In The Eleventh International Conference on Learning Representations, ICLR 2023, Kigali, Rwanda, May 1–5, 2023. OpenReview.net, 2023.
[3] K. Chellapilla, S. Puri, and P. Simard. High performance convolutional neural networks for document processing. In G. Lorette, editor, Tenth International Workshop on Frontiers in Handwriting Recognition, La Baule (France), Oct. 2006. Université de Rennes 1, Suvisoft. http://www.suvisoft.com.
[4] S. Chetlur, C. Woolley, P. Vandermersch, J. Cohen, J. Tran, B. Catanzaro, and E. Shelhamer. cuDNN: efficient primitives for deep learning. CoRR, abs/1410.0759, 2014.
[5] Y. Gong, L. Liu, M. Yang, and L. Bourdev. Compressing deep convolutional networks using vector quantization. arXiv preprint arXiv:1412.6115, 2014.
[6] C. Guo, B. Y. Hsueh, J. Leng, Y. Qiu, Y. Guan, Z. Wang, X. Jia, X. Li, M. Guo, and Y. Zhu. Accelerating sparse DNN models without hardware support via tile-wise sparsity. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC '20. IEEE Press, 2020.
[7] S. Han, X. Liu, H. Mao, J. Pu, A. Pedram, M. A. Horowitz, and W. J. Dally. EIE: efficient inference engine on compressed deep neural network. SIGARCH Comput. Archit. News, 44(3):243–254, June 2016.
[8] S. Han, H. Mao, and W. J. Dally. Deep compression: Compressing deep neural networks with pruning, trained quantization and Huffman coding. arXiv preprint arXiv:1510.00149, 2015.
[9] S. Han, J. Pool, J. Tran, and W. Dally. Learning both weights and connections for efficient neural network. Advances in Neural Information Processing Systems, 28, 2015.
[10] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 770–778, 2016.
[11] Y. He, X. Zhang, and J. Sun. Channel pruning for accelerating very deep neural networks. In 2017 IEEE International Conference on Computer Vision (ICCV), pages 1398–1406, 2017.
[12] Intel. Intel Math Kernel Library (Intel MKL). https://software.intel.com/en-us/intel-mkl.
[13] A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet classification with deep convolutional neural networks. Communications of the ACM, 60(6):84–90, 2017.
[14] Y. Le and X. Yang. Tiny ImageNet visual recognition challenge. CS 231N, 7(7):3, 2015.
[15] V. Lebedev and V. Lempitsky. Fast ConvNets using group-wise brain damage. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 2554–2564, 2016.
[16] H. Li, A. Kadav, I. Durdanovic, H. Samet, and H. P. Graf. Pruning filters for efficient convnets. In 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, April 24–26, 2017, Conference Track Proceedings. OpenReview.net, 2017.
[17] J. Long, E. Shelhamer, and T. Darrell. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3431–3440, 2015.
[18] A. K. Mishra, J. A. Latorre, J. Pool, D. Stosic, D. Stosic, G. Venkatesh, C. Yu, and P. Micikevicius. Accelerating sparse deep neural networks. CoRR, abs/2104.08378, 2021.
[19] P. Molchanov, A. Mallya, S. Tyree, I. Frosio, and J. Kautz. Importance estimation for neural network pruning. In 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 11256–11264, 2019.
[20] NVIDIA. NVIDIA CUDA Basic Linear Algebra Subroutines (cuBLAS) library. https://developer.nvidia.com/cublas.
[21] NVIDIA. NVIDIA A100 Tensor Core GPU architecture. https://images.nvidia.com/aem-dam/en-zz/Solutions/data-center/nvidia-ampere-architecture-whitepaper.pdf, 2020.
[22] A. Parashar, M. Rhu, A. Mukkara, A. Puglielli, R. Venkatesan, B. Khailany, J. S. Emer, S. W. Keckler, and W. J. Dally. SCNN: an accelerator for compressed-sparse convolutional neural networks. CoRR, abs/1708.04485, 2017.
[23] A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, et al. PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems, 32, 2019.
[24] J. Redmon and A. Farhadi. YOLO9000: better, faster, stronger. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 7263–7271, 2017.
[25] S. Ren, K. He, R. Girshick, and J. Sun. Faster R-CNN: Towards real-time object detection with region proposal networks. Advances in Neural Information Processing Systems, 28, 2015.
[26] O. Ronneberger, P. Fischer, and T. Brox. U-Net: Convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015: 18th International Conference, Munich, Germany, October 5–9, 2015, Proceedings, Part III 18, pages 234–241. Springer, 2015.
[27] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, et al. ImageNet large scale visual recognition challenge. International Journal of Computer Vision, 115:211–252, 2015.
[28] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. In Y. Bengio and Y. LeCun, editors, 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7–9, 2015, Conference Track Proceedings, 2015.
[29] M. Tan and Q. Le. EfficientNet: Rethinking model scaling for convolutional neural networks. In K. Chaudhuri and R. Salakhutdinov, editors, Proceedings of the 36th International Conference on Machine Learning, volume 97 of Proceedings of Machine Learning Research, pages 6105–6114. PMLR, 09–15 Jun 2019.
[30] W. Wen, C. Wu, Y. Wang, Y. Chen, and H. Li. Learning structured sparsity in deep neural networks. In Proceedings of the 30th International Conference on Neural Information Processing Systems, NIPS'16, pages 2082–2090, Red Hook, NY, USA, 2016. Curran Associates Inc.
[31] R. Yu, A. Li, C.-F. Chen, J.-H. Lai, V. I. Morariu, X. Han, M. Gao, C.-Y. Lin, and L. S. Davis. NISP: Pruning networks using neuron importance score propagation. In 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9194–9203, 2018. | - |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/93313 | - |
| dc.description.abstract | 卷積神經網路(CNN)在計算機視覺和其他科學領域取得了巨大成功。儘管取得了成就,但最先進的 CNN 模型已經變得龐大,需要大量的計算能力和記憶體資源。作為一種模型壓縮技術,參數剪枝可以降低 CNN 模型所需的計算和記憶體。之前的工作提出了一種參數剪枝方法,通過消除權重矩陣的行和列來壓縮神經網路。移除行和列可以保留權重矩陣的密集結構,而不是非結構化剪枝產生的稀疏結構。儘管行和列剪枝有效地減小了模型的大小,但在不犧牲準確性的前提下,使模型盡可能地小巧仍然是一個挑戰。我們證明了行和列剪枝問題是一個 NP 完全問題,並提出了兩個解為最佳解兩倍以內的近似演算法。我們還表明,這兩種演算法的近似比無法更低。最後,我們提出了兩種基於模擬退火的方案來解決行和列剪枝問題。這兩種提出的方案在使用 Food-101 資料集的 ResNet-18 模型上,在 93% 的稀疏度下分別將準確度提高了 2.7% 和 1.6%。 | zh_TW |
| dc.description.abstract | Convolutional neural networks (CNNs) have achieved immense success in computer vision and other fields of science. Despite these achievements, state-of-the-art CNN models have grown to gigantic sizes and demand substantial computing power and memory. As a model compression technique, parameter pruning lowers the computation and memory a CNN model needs. Previous work proposed a parameter pruning method that compresses a neural network by eliminating rows and columns of its weight matrices. Removing rows and columns preserves the dense structure of the weight matrix, in contrast to the sparse structure produced by unstructured pruning. While row-and-column pruning effectively reduces the model size, making the model as compact as possible without sacrificing accuracy remains a challenge. We prove that the row-and-column pruning problem is NP-complete, and we propose two approximation algorithms whose solutions cost at most twice the optimum. We also show that this approximation ratio cannot be improved for either algorithm. Finally, we propose two schemes based on simulated annealing to solve the row-and-column pruning problem. The two proposed schemes improve accuracy by 2.7% and 1.6%, respectively, on a ResNet-18 model with 93% sparsity on the Food-101 dataset. | en |
| dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2024-07-29T16:12:04Z No. of bitstreams: 0 | en |
| dc.description.provenance | Made available in DSpace on 2024-07-29T16:12:04Z (GMT). No. of bitstreams: 0 | en |
| dc.description.tableofcontents | Oral Examination Committee Certification i
Acknowledgements ii
Abstract (Chinese) iii
Abstract iv
Contents vi
List of Figures viii
List of Tables x
Chapter 1 Introduction 1
1.1 Introduction 1
Chapter 2 Related Work 4
2.1 Unstructured Pruning 4
2.2 Structured Pruning 5
2.3 GeMM-based Convolution 6
Chapter 3 Complexity 9
3.1 Problem Definition 9
3.2 NP-completeness 10
Chapter 4 Two Approximation Algorithms 12
Chapter 5 Scheme 16
5.1 Successive Convolutional Layers 16
5.2 Global Row-then-column Pruning 18
5.3 Simulated Annealing Based Pruning 19
5.3.1 Local SA 21
5.3.2 Neighboring SA 21
5.4 Row-column Pruning for Residual Networks 22
Chapter 6 Experiment 25
6.1 Experiment Settings 25
6.1.1 Models 25
6.1.2 Benchmarks 25
6.1.3 Implementation 26
6.2 Performance of Row-and-column Pruning Methods 26
6.3 Acceleration of Row-and-column Pruning 29
Chapter 7 Conclusion and Future Work 32
References 33 | - |
| dc.language.iso | en | - |
| dc.subject | 參數剪枝 | zh_TW |
| dc.subject | 深度學習 | zh_TW |
| dc.subject | 模擬退火 | zh_TW |
| dc.subject | 近似演算法 | zh_TW |
| dc.subject | NP 完備 | zh_TW |
| dc.subject | 行列剪枝 | zh_TW |
| dc.subject | Simulated annealing | en |
| dc.subject | Deep learning | en |
| dc.subject | Row-and-column pruning | en |
| dc.subject | Parameter pruning | en |
| dc.subject | Approximation algorithms | en |
| dc.subject | NP-completeness | en |
| dc.title | 深度神經網路行列剪枝的近似算法和模擬退火啟發式算法 | zh_TW |
| dc.title | Approximation Algorithms and Simulated Annealing Heuristics for Row-and-Column Pruning of Deep Neural Networks | en |
| dc.type | Thesis | - |
| dc.date.schoolyear | 112-2 | - |
| dc.description.degree | Master's (碩士) | - |
| dc.contributor.oralexamcommittee | 洪鼎詠;吳真貞 | zh_TW |
| dc.contributor.oralexamcommittee | Ding-Yong Hong;Jan-Jan Wu | en |
| dc.subject.keyword | 深度學習,參數剪枝,行列剪枝,NP 完備,近似演算法,模擬退火 | zh_TW |
| dc.subject.keyword | Deep learning,Parameter pruning,Row-and-column pruning,NP-completeness,Approximation algorithms,Simulated annealing | en |
| dc.relation.page | 37 | - |
| dc.identifier.doi | 10.6342/NTU202401008 | - |
| dc.rights.note | Authorized for release (restricted to campus access) | - |
| dc.date.accepted | 2024-06-03 | - |
| dc.contributor.author-college | College of Electrical Engineering and Computer Science (電機資訊學院) | - |
| dc.contributor.author-dept | Graduate Institute of Networking and Multimedia (資訊網路與多媒體研究所) | - |
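The abstract above combines two ideas: structured pruning that removes whole rows and columns of a weight matrix so the surviving submatrix stays dense, and a simulated-annealing search over which rows and columns to keep. The Python/NumPy sketch below is a minimal illustration of those ideas under stated assumptions, not the thesis's implementation: the L2-norm importance score, the squared-norm pruning cost, the geometric cooling schedule, and every function name here are illustrative choices.

```python
# Illustrative sketch only: a greedy row-and-column pruner plus a toy
# simulated-annealing refinement. The importance score (L2 norm), the
# cost function (squared norm of pruned weights), the cooling schedule,
# and all names are assumptions, not the thesis's actual method.
import numpy as np

rng = np.random.default_rng(0)

def prune_rows_and_columns(W, n_rows, n_cols):
    """Greedily keep the n_rows rows and n_cols columns with the
    largest L2 norms; the kept submatrix remains dense."""
    keep_r = np.argsort(-np.linalg.norm(W, axis=1))[:n_rows]
    keep_c = np.argsort(-np.linalg.norm(W, axis=0))[:n_cols]
    return np.sort(keep_r), np.sort(keep_c)

def cost(W, keep_r, keep_c):
    """Squared norm of the weights that pruning destroys (lower is better)."""
    mask = np.zeros_like(W, dtype=bool)
    mask[np.ix_(keep_r, keep_c)] = True
    return float((W[~mask] ** 2).sum())

def anneal(W, keep_r, keep_c, steps=5000, t0=1.0, alpha=0.999):
    """Swap one kept row (or column) with a pruned one per step,
    accepting worse moves with Boltzmann probability exp(-delta/t)."""
    cur = best = cost(W, keep_r, keep_c)
    t = t0
    for _ in range(steps):
        pick_rows = rng.random() < 0.5
        keep = keep_r if pick_rows else keep_c
        size = W.shape[0] if pick_rows else W.shape[1]
        out = np.setdiff1d(np.arange(size), keep)  # currently pruned indices
        cand = keep.copy()
        cand[rng.integers(len(keep))] = out[rng.integers(len(out))]
        new_r, new_c = (np.sort(cand), keep_c) if pick_rows else (keep_r, np.sort(cand))
        new = cost(W, new_r, new_c)
        if new < cur or rng.random() < np.exp((cur - new) / t):
            keep_r, keep_c, cur = new_r, new_c, new
            best = min(best, cur)
        t *= alpha  # geometric cooling
    return keep_r, keep_c, best

W = rng.normal(size=(64, 64))  # a stand-in weight matrix
keep_r, keep_c = prune_rows_and_columns(W, 16, 16)
keep_r, keep_c, best = anneal(W, keep_r, keep_c)
print(f"kept {len(keep_r)} rows x {len(keep_c)} cols, pruning cost {best:.2f}")
```

Keeping 16 of 64 rows and 16 of 64 columns retains 256 of 4,096 weights, roughly the 93% sparsity regime the abstract reports; the annealing step then trades kept indices for pruned ones, accepting occasional cost increases early on so the search can escape the greedy solution's local optimum.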
| Appears in Collections: | Graduate Institute of Networking and Multimedia | |
Files in This Item:
| File | Size | Format | |
|---|---|---|---|
| ntu-112-2.pdf (restricted to NTU campus IP addresses; off-campus users should connect via the VPN service) | 861.15 kB | Adobe PDF | |
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.
