Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/89239

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 陳祝嵩 | zh_TW |
| dc.contributor.advisor | Chu-Song Chen | en |
| dc.contributor.author | 黃奕誠 | zh_TW |
| dc.contributor.author | Yi-Cheng Huang | en |
| dc.date.accessioned | 2023-09-07T16:09:57Z | - |
| dc.date.available | 2024-09-01 | - |
| dc.date.copyright | 2023-09-11 | - |
| dc.date.issued | 2023 | - |
| dc.date.submitted | 2023-08-08 | - |
| dc.identifier.citation | [1] M. S. Abdelfattah, A. Mehrotra, Ł. Dudziak, and N. D. Lane. Zero-cost proxies for lightweight NAS. In ICLR, 2021. [2] Y. Akhauri, J. P. Munoz, N. Jain, and R. Iyer. Evolving zero cost proxies for neural architecture scoring. arXiv preprint arXiv:2209.07413, 2022. [3] T. Akiba, S. Sano, T. Yanase, T. Ohta, and M. Koyama. Optuna: A next-generation hyperparameter optimization framework. In SIGKDD, 2019. [4] J. Bergstra, R. Bardenet, Y. Bengio, and B. Kégl. Algorithms for hyper-parameter optimization. In NeurIPS, 2011. [5] T. C. P. Chau, L. Dudziak, M. S. Abdelfattah, R. Lee, H. Kim, and N. D. Lane. BRP-NAS: Prediction-based NAS using GCNs. In NeurIPS, 2020. [6] W. Chen, X. Gong, and Z. Wang. Neural architecture search on ImageNet in four GPU hours: A theoretically inspired perspective. In ICLR, 2021. [7] X. Dai, A. Wan, P. Zhang, B. Wu, Z. He, Z. Wei, K. Chen, Y. Tian, M. Yu, P. Vajda, and J. E. Gonzalez. FBNetV3: Joint architecture-recipe search using predictor pretraining. In CVPR, 2021. [8] X. Dong and Y. Yang. NAS-Bench-201: Extending the scope of reproducible neural architecture search. In ICLR, 2020. [9] Z. Guo, X. Zhang, H. Mu, W. Heng, Z. Liu, Y. Wei, and J. Sun. Single path one-shot neural architecture search with uniform sampling. In ECCV, 2020. [10] K. Jing, J. Xu, and P. Li. Graph masked autoencoder enhanced predictor for neural architecture search. In IJCAI, 2022. [11] T. Kipf and M. Welling. Variational graph auto-encoders. arXiv preprint arXiv:1611.07308, 2016. [12] A. Krishnakumar, C. White, A. Zela, R. Tu, M. Safari, and F. Hutter. NAS-Bench-Suite-Zero: Accelerating research on zero cost proxies. In NeurIPS Datasets and Benchmarks, 2022. [13] N. Lee, T. Ajanthan, and P. Torr. SNIP: Single-shot network pruning based on connection sensitivity. In ICLR, 2019. [14] M. Lin, P. Wang, Z. Sun, H. Chen, X. Sun, Q. Qian, H. Li, and R. Jin. Zen-NAS: A zero-shot NAS for high-performance image recognition. In ICCV, 2021. [15] H. Liu, K. Simonyan, and Y. Yang. DARTS: Differentiable architecture search. In ICLR, 2019. [16] V. Lopes, S. Alirezazadeh, and L. A. Alexandre. EPE-NAS: Efficient performance estimation without training for neural architecture search. In ICANN, 2021. [17] R. Luo, X. Tan, R. Wang, T. Qin, E. Chen, and T.-Y. Liu. Semi-supervised neural architecture search. In NeurIPS, 2020. [18] R. Luo, F. Tian, T. Qin, E.-H. Chen, and T.-Y. Liu. Neural architecture optimization. In NeurIPS, 2018. [19] J. Mellor, J. Turner, A. Storkey, and E. J. Crowley. Neural architecture search without training. In ICML, 2021. [20] X. Ning, C. Tang, W. Li, Z. Zhou, S. Liang, H. Yang, and Y. Wang. Evaluating efficient performance estimators of neural architectures. In NeurIPS, 2021. [21] H. Pham, M. Guan, B. Zoph, Q. Le, and J. Dean. Efficient neural architecture search via parameters sharing. In ICML, 2018. [22] M. Ruchte, A. Zela, J. N. Siems, J. Grabocka, and F. Hutter. NASLib: A modular and flexible neural architecture search library. 2020. [23] H. Tanaka, D. Kunin, D. L. Yamins, and S. Ganguli. Pruning neural networks without any data by iteratively conserving synaptic flow. In NeurIPS, 2020. [24] Y. Tang, Y. Wang, Y. Xu, H. Chen, B. Shi, C. Xu, C. Xu, Q. Tian, and C. Xu. A semi-supervised assessor of neural architectures. In CVPR, 2020. [25] J. Turner, E. J. Crowley, M. O’Boyle, A. Storkey, and G. Gray. BlockSwap: Fisher-guided block substitution for network compression on a budget. In ICLR, 2020. [26] C. Wang, G. Zhang, and R. Grosse. Picking winning tickets before training by preserving gradient flow. In ICLR, 2020. [27] C. Wei, C. Niu, Y. Tang, and J. Liang. NPENAS: Neural predictor guided evolution for neural architecture search. IEEE Transactions on Neural Networks and Learning Systems, 2020. [28] C. White, W. Neiswanger, S. Nolen, and Y. Savani. A study on encodings for neural architecture search. In NeurIPS, 2020. [29] C. White, W. Neiswanger, and Y. Savani. BANANAS: Bayesian optimization with neural architectures for neural architecture search. In AAAI, 2021. [30] H. Xiao, Z. Wang, Z. Zhu, J. Zhou, and J. Lu. Shapley-NAS: Discovering operation contribution for neural architecture search. In CVPR, 2022. [31] K. Xu, W. Hu, J. Leskovec, and S. Jegelka. How powerful are graph neural networks? In ICLR, 2019. [32] Y. Xu, L. Xie, X. Zhang, X. Chen, G.-J. Qi, Q. Tian, and H. Xiong. PC-DARTS: Partial channel connections for memory-efficient architecture search. In ICLR, 2020. [33] S. Yan, Y. Zheng, W. Ao, X. Zeng, and M. Zhang. Does unsupervised architecture representation learning help neural architecture search? In NeurIPS, 2020. [34] P. Ye, B. Li, Y. Li, T. Chen, J. Fan, and W. Ouyang. β-DARTS: Beta-decay regularization for differentiable architecture search. In CVPR, 2022. | - |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/89239 | - |
| dc.description.abstract | 零代價代理具備對未經訓練的神經網路架構評斷優劣之能力,以利神經網路架構搜尋,使其於最近數年備受關注。然而,目前並無任一零代價代理可在不同架構搜尋空間、任務上皆達到足夠好的表現。為了充分發揮零代價代理的潛力及複數代理之間的互補關係,我們提出「整合式代理學習器」用於神經網路架構搜尋。我們的方法透過學習架構的自編碼器,將離散的架構搜尋任務轉移至連續、可微分的向量空間。我們利用零代價代理計算方便快速的優勢,以極低成本訓練一個預測架構零代價代理分數的預測器。在此之上,我們利用此預測器,在架構的潛空間中以梯度上升法進行架構搜尋,其效率更勝於傳統的隨機搜尋及演化搜尋法。我們的方法具有計算代價低廉之優勢,而搜尋得到的架構表現也十分具競爭力,其中在 ImageNet 分類任務中取得 76.5% 的準確度,是目前在 DARTS 架構搜尋空間中的最佳表現。 | zh_TW |
| dc.description.abstract | Recently, zero-cost proxies for neural architecture search (NAS) have attracted increasing attention and research, since they allow us to discover top-performing neural networks through architecture scoring without first training a very large network (i.e., a supernet), saving significant computation resources and time during the search. However, to our knowledge, no single proxy works best across different tasks and scenarios. To consolidate the strengths of different proxies and to reduce search bias, we propose a novel Integrated Proxy Learner (IPL) that performs a gradient-based search that is more efficient and effective than random-based and evolutionary-based methods. We first train an autoencoder on one-hot encoded neural architectures. Then, we append a multi-layer perceptron (MLP) to the encoder and train the MLP to predict the scores of multiple existing zero-cost proxies simultaneously, in a multi-task learning paradigm, using collected proxy-score data. Finally, we sample new architectures by performing gradient-ascent search with the learned encoder and the MLP scorer. We conduct extensive experiments on the NAS-Bench-201, NAS-Bench-101, and DARTS search spaces with different datasets, achieving state-of-the-art performance (76.5% top-1 accuracy) on ImageNet. The results demonstrate the effectiveness of the proposed approach over other state-of-the-art NAS algorithms. | en |
| dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2023-09-07T16:09:57Z No. of bitstreams: 0 | en |
| dc.description.provenance | Made available in DSpace on 2023-09-07T16:09:57Z (GMT). No. of bitstreams: 0 | en |
| dc.description.tableofcontents | Verification Letter from the Oral Examination Committee i; Acknowledgements iii; 摘要 iv; Abstract v; Contents vii; List of Figures x; List of Tables xi; Chapter 1 Introduction 1; 1.1 Introduction of NAS 1; 1.2 Introduction of Zero-Cost Proxy 2; 1.3 Contribution 4; Chapter 2 Related Work 5; 2.1 Predictor-Based NAS 5; 2.2 Zero-Cost Proxy 6; 2.3 Architecture Embedding 7; Chapter 3 Methodology 8; 3.1 Zero-Cost Proxy 8; 3.2 Architecture Autoencoder 10; 3.2.1 Architecture Encoder 10; 3.2.2 Architecture Decoder 11; 3.2.3 Training Objective of Autoencoder 12; 3.3 Proxy Learner 12; 3.4 Gradient Ascent 13; Chapter 4 Experiment 15; 4.1 Search Space 15; 4.1.1 NAS-Bench-201 15; 4.1.2 NAS-Bench-101 16; 4.1.3 DARTS CNN search space 16; 4.2 Experimental Setup 17; 4.2.1 Proxy Learner 17; 4.2.2 Proxy Weights 18; 4.2.2.1 Details of Finding Appropriate Weights to Combine Different Proxy Scores 18; 4.2.3 Gradient Ascent 19; 4.3 Evaluation Results on NAS-Bench-201 20; 4.4 Evaluation Results on NAS-Bench-101 20; 4.5 Evaluation Results on DARTS Space 22; 4.6 Ablation Studies and Discussions 23; 4.6.1 The rank correlation of the predicted proxy scores 23; 4.6.2 Comparing with baseline search method utilizing Proxy-Learner 26; 4.6.3 Visualization of NAS-Bench-201’s architecture embedding space 27; 4.6.4 Training Data Preprocessing According to Proxy Score Distributions of NAS-Bench-201 and DARTS Spaces 28; Chapter 5 Conclusion 34; References 35 | - |
| dc.language.iso | en | - |
| dc.subject | 零代價代理 | zh_TW |
| dc.subject | 深度學習 | zh_TW |
| dc.subject | 神經網路架構搜索 | zh_TW |
| dc.subject | 影像分類 | zh_TW |
| dc.subject | Neural architecture search | en |
| dc.subject | Deep learning | en |
| dc.subject | Image classification | en |
| dc.subject | Zero-cost proxy | en |
| dc.title | 整合式代理學習器用於神經架構搜索 | zh_TW |
| dc.title | An Integrated Proxy Learner for Neural Architecture Search | en |
| dc.type | Thesis | - |
| dc.date.schoolyear | 111-2 | - |
| dc.description.degree | 碩士 | - |
| dc.contributor.oralexamcommittee | 陳駿丞;李宏毅;孫民 | zh_TW |
| dc.contributor.oralexamcommittee | Jun-Cheng Chen;Hung-Yi Lee;Min Sun | en |
| dc.subject.keyword | 深度學習, 神經網路架構搜索, 零代價代理, 影像分類 | zh_TW |
| dc.subject.keyword | Deep learning, Neural architecture search, Zero-cost proxy, Image classification | en |
| dc.relation.page | 38 | - |
| dc.identifier.doi | 10.6342/NTU202303048 | - |
| dc.rights.note | 未授權 | - |
| dc.date.accepted | 2023-08-09 | - |
| dc.contributor.author-college | 電機資訊學院 | - |
| dc.contributor.author-dept | 資訊工程學系 | - |
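The abstract above describes a three-stage procedure: encode a one-hot architecture into a latent vector, score that vector with an MLP trained on zero-cost proxy scores, and run gradient ascent in the latent space before decoding back to a discrete architecture. The sketch below is a minimal, hypothetical illustration of that search loop only; the names (`gradient_ascent_search`, `encoder`, `decoder`, `scorer`, `proxy_weights`) are illustrative assumptions, not taken from the thesis, which should be consulted for the actual implementation.

```python
# A minimal, hypothetical sketch (not the thesis code) of gradient-ascent
# search over a learned architecture latent space, per the abstract.
import torch

def gradient_ascent_search(encoder, decoder, scorer, onehot_arch,
                           steps=100, lr=0.1, proxy_weights=None):
    """encoder/decoder: a trained architecture autoencoder.
    scorer: an MLP predicting one score per zero-cost proxy from a latent code.
    onehot_arch: a one-hot encoded seed architecture tensor."""
    # Encode the seed architecture; detach so the latent code becomes a
    # leaf tensor we can optimize directly.
    z = encoder(onehot_arch).detach().requires_grad_(True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        scores = scorer(z)  # predicted zero-cost proxy scores
        # Combine the proxies into one objective (weighted sum if given).
        if proxy_weights is not None:
            objective = (scores * proxy_weights).sum()
        else:
            objective = scores.sum()
        opt.zero_grad()
        (-objective).backward()  # ascent = minimizing the negative score
        opt.step()
    # Decode the optimized latent code back to a discrete architecture by
    # taking the argmax over each operation's one-hot logits.
    return decoder(z).argmax(dim=-1)
```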
Appears in Collections: 資訊工程學系
Files in This Item:
| File | Size | Format |
|---|---|---|
| ntu-111-2.pdf (Restricted Access) | 4.48 MB | Adobe PDF |
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.
