Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/89239

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 陳祝嵩 | zh_TW |
| dc.contributor.advisor | Chu-Song Chen | en |
| dc.contributor.author | 黃奕誠 | zh_TW |
| dc.contributor.author | Yi-Cheng Huang | en |
| dc.date.accessioned | 2023-09-07T16:09:57Z | - |
| dc.date.available | 2024-09-01 | - |
| dc.date.copyright | 2023-09-11 | - |
| dc.date.issued | 2023 | - |
| dc.date.submitted | 2023-08-08 | - |
| dc.identifier.citation | [1] M. S. Abdelfattah, A. Mehrotra, Ł. Dudziak, and N. D. Lane. Zero-cost proxies for lightweight NAS. In ICLR, 2021. [2] Y. Akhauri, J. P. Munoz, N. Jain, and R. Iyer. Evolving zero cost proxies for neural architecture scoring. arXiv preprint arXiv:2209.07413, 2022. [3] T. Akiba, S. Sano, T. Yanase, T. Ohta, and M. Koyama. Optuna: A next-generation hyperparameter optimization framework. In SIGKDD, 2019. [4] J. Bergstra, R. Bardenet, Y. Bengio, and B. Kégl. Algorithms for hyper-parameter optimization. In NeurIPS, 2011. [5] T. C. P. Chau, L. Dudziak, M. S. Abdelfattah, R. Lee, H. Kim, and N. D. Lane. BRP-NAS: Prediction-based NAS using GCNs. In NeurIPS, 2020. [6] W. Chen, X. Gong, and Z. Wang. Neural architecture search on ImageNet in four GPU hours: A theoretically inspired perspective. In ICLR, 2021. [7] X. Dai, A. Wan, P. Zhang, B. Wu, Z. He, Z. Wei, K. Chen, Y. Tian, M. Yu, P. Vajda, and J. E. Gonzalez. FBNetV3: Joint architecture-recipe search using predictor pretraining. In CVPR, 2021. [8] X. Dong and Y. Yang. NAS-Bench-201: Extending the scope of reproducible neural architecture search. In ICLR, 2020. [9] Z. Guo, X. Zhang, H. Mu, W. Heng, Z. Liu, Y. Wei, and J. Sun. Single path one-shot neural architecture search with uniform sampling. In ECCV, 2020. [10] K. Jing, J. Xu, and P. Li. Graph masked autoencoder enhanced predictor for neural architecture search. In IJCAI, 2022. [11] T. Kipf and M. Welling. Variational graph auto-encoders. arXiv preprint arXiv:1611.07308, 2016. [12] A. Krishnakumar, C. White, A. Zela, R. Tu, M. Safari, and F. Hutter. NAS-Bench-Suite-Zero: Accelerating research on zero cost proxies. In NeurIPS Datasets and Benchmarks, 2022. [13] N. Lee, T. Ajanthan, and P. Torr. SNIP: Single-shot network pruning based on connection sensitivity. In ICLR, 2019. [14] M. Lin, P. Wang, Z. Sun, H. Chen, X. Sun, Q. Qian, H. Li, and R. Jin. Zen-NAS: A zero-shot NAS for high-performance image recognition. In ICCV, 2021. [15] H. Liu, K. Simonyan, and Y. Yang. DARTS: Differentiable architecture search. In ICLR, 2019. [16] V. Lopes, S. Alirezazadeh, and L. A. Alexandre. EPE-NAS: Efficient performance estimation without training for neural architecture search. In ICANN, 2021. [17] R. Luo, X. Tan, R. Wang, T. Qin, E. Chen, and T.-Y. Liu. Semi-supervised neural architecture search. In NeurIPS, 2020. [18] R. Luo, F. Tian, T. Qin, E.-H. Chen, and T.-Y. Liu. Neural architecture optimization. In NeurIPS, 2018. [19] J. Mellor, J. Turner, A. Storkey, and E. J. Crowley. Neural architecture search without training. In ICML, 2021. [20] X. Ning, C. Tang, W. Li, Z. Zhou, S. Liang, H. Yang, and Y. Wang. Evaluating efficient performance estimators of neural architectures. In NeurIPS, 2021. [21] H. Pham, M. Guan, B. Zoph, Q. Le, and J. Dean. Efficient neural architecture search via parameters sharing. In ICML, 2018. [22] M. Ruchte, A. Zela, J. N. Siems, J. Grabocka, and F. Hutter. NASLib: A modular and flexible neural architecture search library. 2020. [23] H. Tanaka, D. Kunin, D. L. Yamins, and S. Ganguli. Pruning neural networks without any data by iteratively conserving synaptic flow. In NeurIPS, 2020. [24] Y. Tang, Y. Wang, Y. Xu, H. Chen, B. Shi, C. Xu, C. Xu, Q. Tian, and C. Xu. A semi-supervised assessor of neural architectures. In CVPR, 2020. [25] J. Turner, E. J. Crowley, M. O’Boyle, A. Storkey, and G. Gray. BlockSwap: Fisher-guided block substitution for network compression on a budget. In ICLR, 2020. [26] C. Wang, G. Zhang, and R. Grosse. Picking winning tickets before training by preserving gradient flow. In ICLR, 2020. [27] C. Wei, C. Niu, Y. Tang, and J. Liang. NPENAS: Neural predictor guided evolution for neural architecture search. IEEE Transactions on Neural Networks and Learning Systems, 2020. [28] C. White, W. Neiswanger, S. Nolen, and Y. Savani. A study on encodings for neural architecture search. In NeurIPS, 2020. [29] C. White, W. Neiswanger, and Y. Savani. BANANAS: Bayesian optimization with neural architectures for neural architecture search. In AAAI, 2021. [30] H. Xiao, Z. Wang, Z. Zhu, J. Zhou, and J. Lu. Shapley-NAS: Discovering operation contribution for neural architecture search. In CVPR, 2022. [31] K. Xu, W. Hu, J. Leskovec, and S. Jegelka. How powerful are graph neural networks? In ICLR, 2019. [32] Y. Xu, L. Xie, X. Zhang, X. Chen, G.-J. Qi, Q. Tian, and H. Xiong. PC-DARTS: Partial channel connections for memory-efficient architecture search. In ICLR, 2020. [33] S. Yan, Y. Zheng, W. Ao, X. Zeng, and M. Zhang. Does unsupervised architecture representation learning help neural architecture search? In NeurIPS, 2020. [34] P. Ye, B. Li, Y. Li, T. Chen, J. Fan, and W. Ouyang. β-DARTS: Beta-decay regularization for differentiable architecture search. In CVPR, 2022. | - |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/89239 | - |
| dc.description.abstract | 零代價代理具備對未經訓練的神經網路架構評斷優劣之能力,以利神經網路架構搜尋,使其於最近數年備受關注。然而,目前並無任一零代價代理可在不同架構搜尋空間、任務上皆達到足夠好的表現。為了充分發揮零代價代理的潛力及複數代理之間的互補關係,我們提出「整合式代理學習器」用於神經網路架構搜尋。我們的方法透過學習架構的自編碼器,將離散的架構搜尋任務轉移至連續、可微分的向量空間。我們利用零代價代理計算方便快速的優勢,以極低成本訓練一個預測架構零代價代理分數的預測器。在此之上,我們利用此預測器,在架構的潛空間中以梯度上升法進行架構搜尋,其效率更勝於傳統的隨機搜尋及演化搜尋法。我們的方法具有計算代價低廉之優勢,而搜尋得到的架構表現也十分具競爭力,其中在 ImageNet 分類任務中取得 76.5% 的準確度,是目前在 DARTS 架構搜尋空間中的最佳表現。 | zh_TW |
| dc.description.abstract | Recently, zero-cost proxies for neural architecture search (NAS) have attracted increasing attention and research, since they allow us to discover top-performing neural networks through architecture scoring without first training a very large network (i.e., a supernet), saving significant computation resources and time during the search. However, to our knowledge, no single proxy works best across different tasks and scenarios. To consolidate the strengths of different proxies and to reduce search bias, we propose a novel Integrated Proxy Learner (IPL) that performs a gradient-based search that is more efficient and effective than random-based and evolutionary-based methods. We first train an autoencoder on one-hot encoded neural architectures. Then, we append a multi-layer perceptron (MLP) to the encoder and train the MLP to predict the scores of multiple existing zero-cost proxies simultaneously, in a multi-task learning paradigm, using collected proxy-score data. Finally, we sample new architectures by performing gradient-ascent search with the learned encoder and the MLP scorer. We conduct extensive experiments on the NAS-Bench-201, NAS-Bench-101, and DARTS search spaces with different datasets, achieving state-of-the-art performance (76.5% top-1 accuracy) on ImageNet. The results demonstrate the effectiveness of the proposed approach over other state-of-the-art NAS algorithms. | en |
| dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2023-09-07T16:09:57Z No. of bitstreams: 0 | en |
| dc.description.provenance | Made available in DSpace on 2023-09-07T16:09:57Z (GMT). No. of bitstreams: 0 | en |
| dc.description.tableofcontents | Verification Letter from the Oral Examination Committee i; Acknowledgements iii; 摘要 iv; Abstract v; Contents vii; List of Figures x; List of Tables xi; Chapter 1 Introduction 1; 1.1 Introduction of NAS 1; 1.2 Introduction of Zero-Cost Proxy 2; 1.3 Contribution 4; Chapter 2 Related Work 5; 2.1 Predictor-Based NAS 5; 2.2 Zero-Cost Proxy 6; 2.3 Architecture Embedding 7; Chapter 3 Methodology 8; 3.1 Zero-Cost Proxy 8; 3.2 Architecture Autoencoder 10; 3.2.1 Architecture Encoder 10; 3.2.2 Architecture Decoder 11; 3.2.3 Training Objective of Autoencoder 12; 3.3 Proxy Learner 12; 3.4 Gradient Ascent 13; Chapter 4 Experiment 15; 4.1 Search Space 15; 4.1.1 NAS-Bench-201 15; 4.1.2 NAS-Bench-101 16; 4.1.3 DARTS CNN search space 16; 4.2 Experimental Setup 17; 4.2.1 Proxy Learner 17; 4.2.2 Proxy Weights 18; 4.2.2.1 Details of Finding Appropriate Weights to Combine Different Proxy Scores 18; 4.2.3 Gradient Ascent 19; 4.3 Evaluation Results on NAS-Bench-201 20; 4.4 Evaluation Results on NAS-Bench-101 20; 4.5 Evaluation Results on DARTS Space 22; 4.6 Ablation Studies and Discussions 23; 4.6.1 The rank correlation of the predicted proxy scores 23; 4.6.2 Comparing with baseline search method utilizing Proxy-Learner 26; 4.6.3 Visualization of NAS-Bench-201’s architecture embedding space 27; 4.6.4 Training Data Preprocessing According to Proxy Score Distributions of NAS-Bench-201 and DARTS Spaces 28; Chapter 5 Conclusion 34; References 35 | - |
| dc.language.iso | en | - |
| dc.subject | 零代價代理 | zh_TW |
| dc.subject | 深度學習 | zh_TW |
| dc.subject | 神經網路架構搜索 | zh_TW |
| dc.subject | 影像分類 | zh_TW |
| dc.subject | Neural architecture search | en |
| dc.subject | Deep learning | en |
| dc.subject | Image classification | en |
| dc.subject | Zero-cost proxy | en |
| dc.title | 整合式代理學習器用於神經架構搜索 | zh_TW |
| dc.title | An Integrated Proxy Learner for Neural Architecture Search | en |
| dc.type | Thesis | - |
| dc.date.schoolyear | 111-2 | - |
| dc.description.degree | 碩士 | - |
| dc.contributor.oralexamcommittee | 陳駿丞;李宏毅;孫民 | zh_TW |
| dc.contributor.oralexamcommittee | Jun-Cheng Chen;Hung-Yi Lee;Min Sun | en |
| dc.subject.keyword | 深度學習, 神經網路架構搜索, 零代價代理, 影像分類 | zh_TW |
| dc.subject.keyword | Deep learning, Neural architecture search, Zero-cost proxy, Image classification | en |
| dc.relation.page | 38 | - |
| dc.identifier.doi | 10.6342/NTU202303048 | - |
| dc.rights.note | 未授權 | - |
| dc.date.accepted | 2023-08-09 | - |
| dc.contributor.author-college | 電機資訊學院 | - |
| dc.contributor.author-dept | 資訊工程學系 | - |
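The abstract above describes a three-stage procedure: encode a one-hot architecture into a latent vector, score that vector with an MLP trained on zero-cost proxy scores, and run gradient ascent in the latent space before decoding back to a discrete architecture. The sketch below is a minimal, hypothetical illustration of that search loop only; the names (`gradient_ascent_search`, `encoder`, `decoder`, `scorer`, `proxy_weights`) are illustrative assumptions, not taken from the thesis, which should be consulted for the actual implementation.

```python
# A minimal, hypothetical sketch (not the thesis code) of gradient-ascent
# search over a learned architecture latent space, per the abstract.
import torch

def gradient_ascent_search(encoder, decoder, scorer, onehot_arch,
                           steps=100, lr=0.1, proxy_weights=None):
    """encoder/decoder: a trained architecture autoencoder.
    scorer: an MLP predicting one score per zero-cost proxy from a latent code.
    onehot_arch: a one-hot encoded seed architecture tensor."""
    # Encode the seed architecture; detach so the latent code becomes a
    # leaf tensor we can optimize directly.
    z = encoder(onehot_arch).detach().requires_grad_(True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        scores = scorer(z)  # predicted zero-cost proxy scores
        # Combine the proxies into one objective (weighted sum if given).
        if proxy_weights is not None:
            objective = (scores * proxy_weights).sum()
        else:
            objective = scores.sum()
        opt.zero_grad()
        (-objective).backward()  # ascent = minimizing the negative score
        opt.step()
    # Decode the optimized latent code back to a discrete architecture by
    # taking the argmax over each operation's one-hot logits.
    return decoder(z).argmax(dim=-1)
```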
Appears in Collections: 資訊工程學系
Files in This Item:
| File | Size | Format |
|---|---|---|
| ntu-111-2.pdf (Restricted Access) | 4.48 MB | Adobe PDF |
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.
