Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/88159
Title: | Efficient Neural Architecture Generation with an Invertible Neural Network for Neural Architecture Search |
Author: | Guan-Ying Chen (陳冠頴) |
Advisor: | Cheng-Fu Chou (周承復) |
Keywords: | Neural Architecture Search, Machine Learning, Graph Neural Network, Variational Autoencoder, Generative Model, Invertible Neural Network |
Publication Year: | 2023 |
Degree: | Master's |
Abstract: | In recent years, the efficient and automated discovery of high-performing neural architectures has attracted increasing attention. In practice, obtaining the true predictive performance of every architecture is time-consuming and computationally expensive, since each model must actually be trained on a given dataset. The primary goal is therefore to find well-performing architectures within a limited time budget and with only a limited number of architectures whose accuracy has already been evaluated. To reduce search time and computation, surrogate models that predict the accuracy of each architecture have become popular; they are typically coupled with genetic algorithms or optimization algorithms such as Local Search (LS), Random Search (RS), and Bayesian Optimization (BO) to identify better architectures within a predefined search space. However, some of these methods use only the labeled training data and fail to exploit the unlabeled architectures, i.e., the untrained architectures that make up the rest of the search space.

The method proposed in this thesis builds on an invertible neural network (INN) and a variational autoencoder (VAE) to inversely infer plausible architectures from a given target accuracy. It makes full use of the unlabeled architectures in the entire search space as pre-training data, training the VAE with a self-supervised learning mechanism. The VAE encoder maps an architecture from its discrete space into a continuous latent space, and the INN is then trained as a regressor that predicts an architecture's true accuracy from its latent representation. Finally, exploiting the invertibility of the INN, we can infer the latent representations of better-performing architectures; because the method also acts as a surrogate model, it can predict the accuracy of candidate architectures and add promising ones to the training data. Through iterative rounds of inverse inference and retraining, the model learns to generate better-performing architectures, achieving the goal of finding architectures with higher accuracy.

We evaluate the proposed method on widely used public NAS benchmarks, NAS-Bench-101 and NAS-Bench-201, which provide a fair platform for comparing NAS approaches. The results show that our method finds well-performing architectures with a limited number of evaluated architectures and is comparable to state-of-the-art approaches. |
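As a concrete illustration of the bidirectional use of the INN described in the abstract, here is a minimal sketch in PyTorch, assuming a RealNVP-style affine-coupling network and a pre-trained VAE encoder. All names and hyperparameters (`LATENT_DIM`, `AffineCoupling`, `propose_latents`, the placeholder tensors `z` and `acc`) are illustrative assumptions, not the thesis implementation.

```python
# Sketch: forward pass regresses accuracy from an architecture's latent code;
# the inverse pass proposes new latents from a high target accuracy.
import torch
import torch.nn as nn

LATENT_DIM = 16  # assumed dimensionality of the VAE latent space


class AffineCoupling(nn.Module):
    """One RealNVP-style affine coupling block: invertible by construction."""

    def __init__(self, dim: int):
        super().__init__()
        self.half = dim // 2
        self.net = nn.Sequential(
            nn.Linear(self.half, 64), nn.ReLU(),
            nn.Linear(64, 2 * (dim - self.half)),
        )

    def forward(self, x):
        x1, x2 = x[:, :self.half], x[:, self.half:]
        s, t = self.net(x1).chunk(2, dim=1)
        return torch.cat([x1, x2 * torch.exp(torch.tanh(s)) + t], dim=1)

    def inverse(self, y):
        y1, y2 = y[:, :self.half], y[:, self.half:]
        s, t = self.net(y1).chunk(2, dim=1)
        return torch.cat([y1, (y2 - t) * torch.exp(-torch.tanh(s))], dim=1)


class INN(nn.Module):
    """Stack of coupling blocks; flipping between blocks lets every
    coordinate be transformed at some depth."""

    def __init__(self, dim: int, n_blocks: int = 4):
        super().__init__()
        self.blocks = nn.ModuleList(AffineCoupling(dim) for _ in range(n_blocks))

    def forward(self, x):
        for b in self.blocks:
            x = torch.flip(b(x), dims=[1])
        return x

    def inverse(self, y):
        for b in reversed(self.blocks):
            y = b.inverse(torch.flip(y, dims=[1]))
        return y


inn = INN(LATENT_DIM)
opt = torch.optim.Adam(inn.parameters(), lr=1e-3)

# Stand-ins for the real data: z would be VAE-encoded labeled architectures,
# acc their measured accuracies on a benchmark such as NAS-Bench-101.
z = torch.randn(128, LATENT_DIM)
acc = torch.rand(128)

for step in range(200):  # forward direction: surrogate regression
    y = inn(z)
    loss = ((y[:, 0] - acc) ** 2).mean()  # supervise the accuracy coordinate
    # (the remaining coordinates are typically regularized toward N(0, 1))
    opt.zero_grad(); loss.backward(); opt.step()


def propose_latents(target_acc: float, n: int = 32) -> torch.Tensor:
    """Backward direction: fix the accuracy coordinate at a high target,
    sample the free coordinates, and invert to get candidate latents."""
    y = torch.randn(n, LATENT_DIM)
    y[:, 0] = target_acc
    with torch.no_grad():
        return inn.inverse(y)


candidates = propose_latents(target_acc=0.95)
```

In the full method, the VAE decoder would map the proposed latents back to discrete architectures, the surrogate (forward) direction would rank these candidates, and the most promising ones would be evaluated and added to the labeled set for the next round of retraining.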
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/88159 |
DOI: | 10.6342/NTU202300924 |
Full-text License: | Authorized (open access worldwide) |
Appears in Collections: | Department of Computer Science and Information Engineering |
Files in This Item:
File | Size | Format | |
---|---|---|---|
ntu-111-2.pdf | 3.28 MB | Adobe PDF | View/Open |
All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.