Please use this identifier to cite or link to this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/91611

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 周承復 | zh_TW |
| dc.contributor.advisor | Cheng-Fu Chou | en |
| dc.contributor.author | 楊大煒 | zh_TW |
| dc.contributor.author | Ta-Wei Yang | en |
| dc.date.accessioned | 2024-02-20T16:11:46Z | - |
| dc.date.available | 2024-02-21 | - |
| dc.date.copyright | 2024-02-20 | - |
| dc.date.issued | 2024 | - |
| dc.date.submitted | 2024-01-30 | - |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/91611 | - |
| dc.description.abstract | 神經架構搜索(NAS)演算法旨在尋找效能表現最佳的模型,其中候選模型的評估通常是計算最昂貴的任務。NAS-Bench數據集有助於為架構評估訓練性能預測模型,進而能夠實現NAS搜索算法的發展,以找到最優的模型架構。在本論文中,我們把NAS問題轉換為神經網絡模型架構之效能預測的反問題。主要思想是使用圖形變分自編碼器(GVAEs)和可逆神經網絡(INNs)用來組成我們的效能預測模型InvertNAS,InvertNAS模型可以讓神經架構映射到其效能,反方向也可以。
這也就是說,GVAE能夠從取樣的神經架構中提取有用的低維潛在空間代碼,而聚合的INNs可以通過將這些潛在代碼映射到模型準確度來找出最終性能預測器。與其他NAS算法相比,結果顯示我們所提出的InvertNAS方法能夠於NASBench101上識別出排名第二的架構,在NASBench201上找到最優的架構,同時確保高查詢效率;我們相信,這種神經網絡架構之效能預測的反向對應方法是NAS的一個有前途與可用的解決方式。 | zh_TW |
| dc.description.abstract | Neural Architecture Search (NAS) algorithms aim to find the best-performing model, and evaluating candidate models is typically the most computationally expensive step. The NAS-Bench datasets facilitate the training of performance prediction models for architecture evaluation, enabling the development of NAS search algorithms that find optimal model architectures. In this thesis, we recast NAS as the inverse problem of performance prediction for neural network architectures. The main idea is to combine Graph Variational Autoencoders (GVAEs) and Invertible Neural Networks (INNs) into our performance prediction model, InvertNAS, which maps neural architectures to their performance and vice versa. That is, the GVAE extracts useful low-dimensional latent codes from sampled neural architectures, while the aggregated INNs form the final performance predictor by mapping these latent codes to model accuracy. Compared with other NAS algorithms, the results indicate that InvertNAS identifies the second-ranked architecture on NASBench101 and the optimal architecture on NASBench201 while maintaining high query efficiency; we believe such an inverse formulation of neural network performance prediction is a promising solution for NAS. (An illustrative code sketch of such an invertible latent-to-accuracy mapping follows this metadata table.) | en |
| dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2024-02-20T16:11:46Z No. of bitstreams: 0 | en |
| dc.description.provenance | Made available in DSpace on 2024-02-20T16:11:46Z (GMT). No. of bitstreams: 0 | en |
| dc.description.tableofcontents | Verification Letter from the Oral Examination Committee
Acknowledgements
摘要
Abstract
Contents
List of Figures
List of Tables
Denotation
Chapter 1 Introduction
1.1 Introduction
1.2 Motivation
1.3 Contribution
Chapter 2 Related Work
2.1 Autoencoder
2.2 Graph Neural Network (GNN)
2.3 Invertible Neural Network (INN)
2.4 Ensemble Learning
2.4.1 Mirror Averaging
2.5 NAS Benchmarks
2.5.1 NASBench101
2.5.2 NASBench201
2.6 Neural Architecture Search (NAS)
Chapter 3 Method
3.1 System Architecture
3.1.1 Graph Variational Autoencoder (GVAE)
3.1.2 Invertible Neural Network (INN)
3.1.3 Ensemble Learning (Aggregation Method)
3.2 Mapping Relationship for the Neural Network Architecture and its Model Accuracy
3.3 Searching Algorithm
Chapter 4 Experiments
4.1 NAS Benchmarks Data Preprocessing
4.2 Training Details of InvertNAS
4.3 Evaluation Results
4.3.1 NASBench101
4.3.2 NASBench201
4.4 Evaluation of the Performance Predictor for NAS
4.5 Ablation Studies
4.5.1 Candidates Generative Methods
4.5.2 Decoder Types
4.6 Visualizing Predicted Results in InvertNAS
4.6.1 Visualization Results of NASBench101
4.6.1.1 CIFAR10
4.6.2 Visualization Results of NASBench201
4.6.2.1 CIFAR10
4.6.2.2 CIFAR100
4.6.2.3 ImageNet16-120
Chapter 5 Conclusion
References | - |
| dc.language.iso | en | - |
| dc.subject | 圖變分自編碼器 | zh_TW |
| dc.subject | 神經網路架構搜索演算法 | zh_TW |
| dc.subject | 機器學習 | zh_TW |
| dc.subject | 集成學習 | zh_TW |
| dc.subject | 可逆神經網路 | zh_TW |
| dc.subject | Invertible Neural Network | en |
| dc.subject | Ensemble Learning | en |
| dc.subject | Machine Learning | en |
| dc.subject | Graph Variational Autoencoder | en |
| dc.subject | Neural Architecture Search | en |
| dc.title | 深入探討使用聚合可逆神經網路之效能預測器於神經網路架構搜索的應用 | zh_TW |
| dc.title | Deep Dive into the Application of Aggregated Invertible Neural Network as Performance Predictor in Neural Architecture Search | en |
| dc.type | Thesis | - |
| dc.date.schoolyear | 112-1 | - |
| dc.description.degree | 博士 | - |
| dc.contributor.oralexamcommittee | 吳曉光;呂政修;廖婉君;陳文進;李明穗;蔡欣穆;黃志煒;蔡子傑 | zh_TW |
| dc.contributor.oralexamcommittee | Hsiao-Kuang Wu;Jenq-Shiou Leu;Wan-Jiun Liao;Wen-Chin Chen;Ming-Sui Lee;Hsin-Mu Tsai;Chih-Wei Huang;Tzu-Chieh Tsai | en |
| dc.subject.keyword | 神經網路架構搜索演算法,圖變分自編碼器,可逆神經網路,集成學習,機器學習 | zh_TW |
| dc.subject.keyword | Neural Architecture Search, Graph Variational Autoencoder, Invertible Neural Network, Ensemble Learning, Machine Learning | en |
| dc.relation.page | 110 | - |
| dc.identifier.doi | 10.6342/NTU202400317 | - |
| dc.rights.note | 同意授權(限校園內公開) | - |
| dc.date.accepted | 2024-02-01 | - |
| dc.contributor.author-college | 電機資訊學院 | - |
| dc.contributor.author-dept | 資訊網路與多媒體研究所 | - |
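As referenced in the English abstract above, the following is a minimal, self-contained sketch of the kind of invertible latent-to-accuracy mapping the abstract describes: a toy projection stands in for the GVAE encoder, and affine coupling blocks provide an exactly invertible map. It uses plain NumPy, and every name and dimension in it (`AffineCoupling`, `encode_architecture`, the 8-dimensional latent code) is an illustrative assumption rather than code from the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)


class AffineCoupling:
    """One affine coupling block: invertible by construction.

    The input is split in half; the first half passes through unchanged and
    parameterizes a scale/shift applied to the second half, so the inverse
    can be computed exactly.
    """

    def __init__(self, dim):
        half = dim // 2
        # Fixed random linear maps stand in for the learned subnetworks.
        self.w_s = rng.normal(scale=0.1, size=(half, half))
        self.w_t = rng.normal(scale=0.1, size=(half, half))

    def forward(self, z):
        z1, z2 = np.split(z, 2)
        s, t = z1 @ self.w_s, z1 @ self.w_t
        return np.concatenate([z1, z2 * np.exp(s) + t])

    def inverse(self, y):
        y1, y2 = np.split(y, 2)
        s, t = y1 @ self.w_s, y1 @ self.w_t
        return np.concatenate([y1, (y2 - t) * np.exp(-s)])


def encode_architecture(adjacency, op_ids, dim=8):
    """Toy stand-in for a GVAE encoder: project the cell graph to a latent code."""
    feats = np.concatenate([adjacency.flatten(), np.asarray(op_ids, dtype=float)])
    proj = rng.normal(size=(feats.size, dim)) / np.sqrt(feats.size)
    return feats @ proj


if __name__ == "__main__":
    # A tiny 4-node cell: upper-triangular adjacency plus per-node operation ids.
    adj = np.triu(np.ones((4, 4)), k=1)
    ops = [0, 2, 1, 3]

    z = encode_architecture(adj, ops)        # architecture -> latent code
    blocks = [AffineCoupling(z.size) for _ in range(3)]

    y = z
    for block in blocks:                     # latent code -> prediction space
        y = block.forward(y)

    z_back = y
    for block in reversed(blocks):           # exact inverse: back to the latent code
        z_back = block.inverse(z_back)

    print("round-trip error:", float(np.max(np.abs(z - z_back))))  # ~1e-15
```

Coupling blocks are a standard way to build invertible networks because each block's inverse exists in closed form; that property is what lets a predictor trained in the architecture-to-accuracy direction also be queried in reverse, which is the inverse-problem view the abstract takes.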
| Appears in Collections: | 資訊網路與多媒體研究所 | |
Files in This Item:
| File | Size | Format |
|---|---|---|
| ntu-112-1.pdf (access limited to NTU IP range) | 7.39 MB | Adobe PDF |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
