Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/88376

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 王勝德 | zh_TW |
| dc.contributor.advisor | Sheng-De Wang | en |
| dc.contributor.author | 馮泰哲 | zh_TW |
| dc.contributor.author | Tai-Che Feng | en |
| dc.date.accessioned | 2023-08-09T16:47:20Z | - |
| dc.date.available | 2023-11-09 | - |
| dc.date.copyright | 2023-08-09 | - |
| dc.date.issued | 2023 | - |
| dc.date.submitted | 2023-07-25 | - |
| dc.identifier.citation | [1] L. Cai, Z. An, and Y. Xu. GHFP: Gradually hard filter pruning. arXiv preprint arXiv:2011.03170, 2020. [2] L. Cai, Z. An, C. Yang, and Y. Xu. Softer pruning, incremental regularization. In 25th International Conference on Pattern Recognition (ICPR), pages 224–230. IEEE, 2021. [3] X. Chen, R. Wang, X. Tang, and C.-J. Hsieh. DrNAS: Dirichlet neural architecture search. In International Conference on Learning Representations (ICLR), 2021. [4] X. Chen, L. Xie, J. Wu, and Q. Tian. Progressive differentiable architecture search: Bridging the depth gap between search and evaluation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 1294–1303, 2019. [5] X. Chu, X. Wang, B. Zhang, S. Lu, X. Wei, and J. Yan. DARTS-: Robustly stepping out of performance collapse without indicators. In International Conference on Learning Representations (ICLR), 2021. [6] X. Chu, T. Zhou, B. Zhang, and J. Li. Fair DARTS: Eliminating unfair advantages in differentiable architecture search. In 16th European Conference on Computer Vision (ECCV), 2020. [7] X. Dong and Y. Yang. NAS-Bench-201: Extending the scope of reproducible neural architecture search. In International Conference on Learning Representations (ICLR), 2020. [8] J. Guo, W. Zhang, W. Ouyang, and D. Xu. Model compression using progressive channel pruning. IEEE Transactions on Circuits and Systems for Video Technology, 31(3):1114–1124, 2020. [9] S. Han, H. Mao, and W. J. Dally. Deep compression: Compressing deep neural networks with pruning, trained quantization and Huffman coding. In International Conference on Learning Representations (ICLR), 2016. [10] S. Han, J. Pool, J. Tran, and W. J. Dally. Learning both weights and connections for efficient neural networks. In Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 1, pages 1135–1143, 2015. [11] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 770–778, 2016. [12] Y. He, Y. Ding, P. Liu, L. Zhu, H. Zhang, and Y. Yang. Learning filter pruning criteria for deep convolutional neural networks acceleration. In 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 2006–2015. IEEE, 2020. [13] Y. He, X. Dong, G. Kang, Y. Fu, C. Yan, and Y. Yang. Asymptotic soft filter pruning for deep convolutional neural networks. IEEE Transactions on Cybernetics, pages 1–11, 2019. [14] Y. He, G. Kang, X. Dong, Y. Fu, and Y. Yang. Soft filter pruning for accelerating deep convolutional neural networks. In International Joint Conference on Artificial Intelligence (IJCAI), pages 2234–2240, 2018. [15] A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, and H. Adam. MobileNets: Efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861, 2017. [16] M. Kendall. A new measure of rank correlation. Biometrika, 30(1/2):81–93, 1938. [17] G. Li, G. Qian, I. C. Delgadillo, M. Müller, A. Thabet, and B. Ghanem. SGAS: Sequential greedy architecture search. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020. [18] C. Liu, P. Liu, W. Zhao, and X. Tang. Visual tracking by structurally optimizing pre-trained CNN. IEEE Transactions on Circuits and Systems for Video Technology, 30(9):3153–3166, 2019. [19] H. Liu, K. Simonyan, and Y. Yang. DARTS: Differentiable architecture search. In International Conference on Learning Representations (ICLR), 2019. [20] Z. Liu, M. Sun, T. Zhou, G. Huang, and T. Darrell. Rethinking the value of network pruning. In International Conference on Learning Representations (ICLR), 2019. [21] E. Real, A. Aggarwal, Y. Huang, and Q. V. Le. Regularized evolution for image classifier architecture search. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, pages 4780–4789, 2019. [22] M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L.-C. Chen. MobileNetV2: Inverted residuals and linear bottlenecks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4510–4520, 2018. [23] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. In International Conference on Learning Representations (ICLR), May 2015. [24] M. Tan, B. Chen, R. Pang, V. Vasudevan, M. Sandler, A. Howard, and Q. V. Le. MnasNet: Platform-aware neural architecture search for mobile. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 2820–2828, 2019. [25] A. Wan, X. Dai, P. Zhang, Z. He, Y. Tian, S. Xie, B. Wu, M. Yu, T. Xu, K. Chen, et al. FBNetV2: Differentiable neural architecture search for spatial and channel dimensions. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 12965–12974, 2020. [26] R. Wang, M. Cheng, X. Chen, X. Tang, and C.-J. Hsieh. Rethinking architecture selection in differentiable NAS. In International Conference on Learning Representations (ICLR), 2021. [27] B. Wu, X. Dai, P. Zhang, Y. Wang, F. Sun, Y. Wu, Y. Tian, P. Vajda, Y. Jia, and K. Keutzer. FBNet: Hardware-aware efficient convnet design via differentiable neural architecture search. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10734–10742, 2019. [28] S. Xie, H. Zheng, C. Liu, and L. Lin. SNAS: Stochastic neural architecture search. In International Conference on Learning Representations (ICLR), 2019. [29] Y. Xu, Y. Wang, K. Han, Y. Tang, S. Jui, C. Xu, and C. Xu. ReNAS: Relativistic evaluation of neural architecture search. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4411–4420, 2021. [30] Y. Xu, L. Xie, X. Zhang, X. Chen, G.-J. Qi, Q. Tian, and H. Xiong. PC-DARTS: Partial channel connections for memory-efficient architecture search. In International Conference on Learning Representations (ICLR), 2020. [31] P. Ye, B. Li, Y. Li, T. Chen, J. Fan, and W. Ouyang. β-DARTS: Beta-decay regularization for differentiable architecture search. In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 10864–10873. IEEE, 2022. [32] M. Zhang, S. W. Su, S. Pan, X. Chang, E. M. Abbasnejad, and R. Haffari. iDARTS: Differentiable architecture search with stochastic implicit gradients. In International Conference on Machine Learning, pages 12557–12566. PMLR, 2021. [33] Y. Zhou, X. Xie, and S.-Y. Kung. Exploiting operation importance for differentiable neural architecture search. IEEE Transactions on Neural Networks and Learning Systems, 33(11):6235–6248, 2021. [34] B. Zoph and Q. Le. Neural architecture search with reinforcement learning. In International Conference on Learning Representations (ICLR), 2017. | - |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/88376 | - |
| dc.description.abstract | 近年來,可微分神經網路架構搜尋因其簡單性和高效的搜索能力而引起越來越多的關注。然而,這樣的搜尋方法卻有很大的機會遭遇過度擬合,而造成搜索出來的模型性能崩潰的問題。因此本篇論文我們提出SP-DARTS,一種基於軟剪枝的架構搜尋方法來解決這個問題。首先和以往的搜尋方法不同的地方在於,我們認為這種可微分的架構搜尋方式可以看做是一個模型剪枝的問題,將不重要的操作從擁有所有可能架構的超網中刪除來得出最終的模型,並且認為一般傳統的硬剪枝方法會讓搜索空間的容量在訓練的過程中逐漸減少而導致不穩定的搜尋結果,所以在這裡我們嘗試使用軟剪枝的方式進行訓練。其次,原本的DARTS在訓練結束後會選擇擁有最大操作係數的操作來形成最終的架構,但我們發現這樣的作法並不能夠真正的反映出操作之間的重要程度,因此我們直接測量每個操作對超網的影響來當作他們重要程度的依據。我們將方法實作在NAS-Bench-201的搜索空間上,實驗結果顯示SP-DARTS和漸進式的SRP-DARTS都是一個穩健的搜尋方法,搜尋出來的架構有良好的性能及穩定的結果。 | zh_TW |
| dc.description.abstract | Recently, Differentiable Architecture Search (DARTS) has gained increasing attention due to its simplicity and efficient search capability. However, such search methods have a significant chance of encountering overfitting, which can cause the performance of the discovered models to collapse. In this paper, we propose SP-DARTS, a soft-pruning-based differentiable architecture search method that addresses this issue. First, unlike previous search methods, we view differentiable architecture search as a model pruning problem: unimportant operations are removed from a supernet containing all possible architectures to obtain the final model. We also argue that traditional hard pruning gradually reduces the capacity of the search space during training, leading to unstable search results, so we adopt a soft pruning approach in our training process. Second, the original DARTS method selects the operation with the largest architecture parameter on each edge to form the final architecture after training, but we found that this approach does not truly reflect the importance of the operations. Therefore, instead of looking directly at the architecture parameters, we estimate each candidate operation's impact on the supernet. Finally, we implement our method on the NAS-Bench-201 search space, and the experimental results show that both SP-DARTS and its progressive variant SRP-DARTS are robust search methods that discover architectures with good performance and stable results. | en |
| dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2023-08-09T16:47:20Z No. of bitstreams: 0 | en |
| dc.description.provenance | Made available in DSpace on 2023-08-09T16:47:20Z (GMT). No. of bitstreams: 0 | en |
| dc.description.tableofcontents | Verification Letter from the Oral Examination Committee i Acknowledgements ii 摘要 iii Abstract iv Contents vi List of Figures viii List of Tables ix Chapter 1 Introduction 1 Chapter 2 Related Works 3 2.1 NAS: Neural Architecture Search 3 2.2 DARTS: Differentiable Architecture Search 5 2.3 Model Pruning 7 2.4 Soft Pruning 8 Chapter 3 Approach 10 3.1 DARTS Pruning 10 3.2 SP-DARTS: Soft Pruning DARTS 12 3.3 Determine Operation Strength Using Validation Set 14 3.4 SP-DARTS Using Validation Accuracy 17 3.5 SRP-DARTS: SofteR Pruning DARTS 19 Chapter 4 Experiments 21 4.1 Search Space: NAS-Bench-201 21 4.2 Implementation Details 22 4.3 Search Results 22 Chapter 5 Ablation Study 25 5.1 Soft Pruning vs Hard Pruning 25 5.2 The Effect of Dropout 26 5.3 Validation Accuracy vs Alpha Value 28 5.4 The Impact of Validation Set Proportion 29 Chapter 6 Conclusion 31 References 33 | - |
| dc.language.iso | en | - |
| dc.subject | 模型剪枝 | zh_TW |
| dc.subject | 軟剪枝 | zh_TW |
| dc.subject | 深度學習 | zh_TW |
| dc.subject | 模型最佳化 | zh_TW |
| dc.subject | 神經網路架構搜尋 | zh_TW |
| dc.subject | 可微分神經網路架構搜尋 | zh_TW |
| dc.subject | 模型壓縮 | zh_TW |
| dc.subject | Deep Learning | en |
| dc.subject | Model Compression | en |
| dc.subject | Differentiable Architecture Search | en |
| dc.subject | Neural Architecture Search | en |
| dc.subject | Model Optimization | en |
| dc.subject | Model Pruning | en |
| dc.subject | Soft Pruning | en |
| dc.title | 基於軟剪枝的可微分神經網路架構搜尋 | zh_TW |
| dc.title | SP-DARTS: Soft Pruning Differentiable Architecture Search | en |
| dc.type | Thesis | - |
| dc.date.schoolyear | 111-2 | - |
| dc.description.degree | 碩士 | - |
| dc.contributor.oralexamcommittee | 雷欽隆;余承叡 | zh_TW |
| dc.contributor.oralexamcommittee | Chin-Laung Lei;Cheng-Juei Yu | en |
| dc.subject.keyword | 深度學習,模型最佳化,神經網路架構搜尋,可微分神經網路架構搜尋,模型壓縮,模型剪枝,軟剪枝 | zh_TW |
| dc.subject.keyword | Deep Learning,Model Optimization,Neural Architecture Search,Differentiable Architecture Search,Model Compression,Model Pruning,Soft Pruning | en |
| dc.relation.page | 36 | - |
| dc.identifier.doi | 10.6342/NTU202301914 | - |
| dc.rights.note | 同意授權(限校園內公開) | - |
| dc.date.accepted | 2023-07-26 | - |
| dc.contributor.author-college | 電機資訊學院 | - |
| dc.contributor.author-dept | 電機工程學系 | - |
| dc.date.embargo-lift | 2024-09-01 | - |
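The abstract's selection criterion, measuring each candidate operation's impact on the supernet rather than reading off its architecture parameter, can be illustrated with a toy sketch. This is not the thesis implementation: the edge, the candidate operations, and the validation score below are all assumptions chosen only to show the mechanics of soft pruning (zeroing a mixture weight without deleting the operation) and impact-based ranking.

```python
import numpy as np

rng = np.random.default_rng(0)

# A single DARTS-style edge: the output is a softmax-weighted mixture of
# candidate operations applied to the input. Toy stand-ins, not real convs.
ops = [
    lambda x: np.zeros_like(x),    # "none" (zero) operation
    lambda x: x,                   # skip connection
    lambda x: np.tanh(x),          # stand-in for one conv candidate
    lambda x: np.maximum(x, 0.0),  # stand-in for another conv candidate
]
alpha = np.array([0.1, 2.0, 1.5, 0.5])  # architecture parameters (one edge)

def softmax(a):
    e = np.exp(a - a.max())
    return e / e.sum()

def edge_output(x, alpha, masked=None):
    """Mixture output; `masked` soft-prunes one op by zeroing its weight."""
    w = softmax(alpha)
    if masked is not None:
        w = w.copy()
        w[masked] = 0.0            # zeroed, not removed: the op can recover
        w = w / w.sum()            # renormalize the remaining weights
    return sum(wi * op(x) for wi, op in zip(w, ops))

def val_score(x, target, alpha, masked=None):
    """Toy 'validation' score: negative mean squared error."""
    return -np.mean((edge_output(x, alpha, masked) - target) ** 2)

x = rng.normal(size=200)
target = np.maximum(x, 0.0)        # pretend the ReLU-like op fits the task

base = val_score(x, target, alpha)
# Operation strength = drop in validation score when that op alone is masked.
strength = [base - val_score(x, target, alpha, masked=i)
            for i in range(len(ops))]
print("alpha ranking   :", np.argsort(alpha)[::-1])
print("strength ranking:", np.argsort(strength)[::-1])
```

The two rankings need not agree, which is the point the abstract makes: the largest architecture parameter is not necessarily the operation whose removal hurts the supernet most.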
| Appears in Collections: | 電機工程學系 | |
Files in This Item:
| File | Size | Format | |
|---|---|---|---|
| ntu-111-2.pdf (Access restricted to NTU campus IP addresses; off-campus users please use the VPN service) | 5.65 MB | Adobe PDF | |
All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.
