Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/88376
Title: | SP-DARTS: Soft Pruning Differentiable Architecture Search |
Author: | Tai-Che Feng (馮泰哲) |
Advisor: | Sheng-De Wang (王勝德) |
Keywords: | Deep Learning, Model Optimization, Neural Architecture Search, Differentiable Architecture Search, Model Compression, Model Pruning, Soft Pruning |
Publication Year: | 2023 |
Degree: | Master's |
Abstract: | Recently, Differentiable Architecture Search (DARTS) has gained increasing attention due to its simplicity and efficient search capability. However, such search methods have a significant chance of encountering overfitting, which can cause the performance of the discovered models to collapse. In this paper, we propose SP-DARTS, a soft-pruning-based differentiable architecture search method, to address this issue. First, unlike previous search methods, we view the differentiable architecture search process as a model pruning problem: unimportant operations are removed from the supernet, which contains all possible architectures, to obtain the final model. We also observe that traditional hard pruning gradually reduces the capacity of the search space during training, leading to unstable search results, so we instead adopt a soft pruning approach during training. Second, the original DARTS selects the operation with the maximum architecture parameter on each edge to form the final architecture after training, but we found that this approach does not truly reflect the operations' importance. We therefore estimate each candidate operation's impact on the supernet directly, rather than reading its architecture parameter.
Finally, we implement our method on the NAS-Bench-201 search space, and the experimental results show that both SP-DARTS and the progressive variant SRP-DARTS are robust search methods that discover architectures with good performance and stable results. |
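The abstract's importance criterion (measuring each candidate operation's impact on the supernet rather than reading its architecture parameter) can be illustrated with a toy sketch. This is a hypothetical simplification, not the thesis implementation: the mixed edge, the soft mask, and the perturbation-style impact score below are illustrative assumptions.

```python
import numpy as np

def softmax(a):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(a - a.max())
    return e / e.sum()

def mixed_output(x, ops, alpha, mask):
    """DARTS-style mixed edge: softmax-weighted sum of candidate ops.
    Soft pruning would attenuate weak ops via small mask values
    instead of deleting them outright (hard pruning)."""
    w = softmax(alpha) * mask
    return sum(wi * op(x) for wi, op in zip(w, ops))

def op_impact(x, ops, alpha, mask, idx):
    """Hypothetical impact score: how much the edge output changes
    when operation `idx` is masked out of the supernet."""
    base = mixed_output(x, ops, alpha, mask)
    m = mask.copy()
    m[idx] = 0.0  # temporarily remove this op
    return np.linalg.norm(base - mixed_output(x, ops, alpha, m))

# Toy usage: an identity op vs. a "zero" op with equal architecture
# parameters; the impact score separates them even though alpha cannot.
ops = [lambda x: x, lambda x: np.zeros_like(x)]
alpha = np.zeros(2)          # equal architecture parameters
mask = np.ones(2)            # nothing pruned yet
x = np.ones(4)
scores = [op_impact(x, ops, alpha, mask, i) for i in range(2)]
```

Here the identity op receives a higher impact score than the zero op, even though both carry identical architecture parameters, which mirrors the abstract's point that the maximum-parameter rule need not reflect true importance.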
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/88376 |
DOI: | 10.6342/NTU202301914 |
Full-Text Authorization: | Authorized (campus access only) |
Appears in Collections: | Department of Electrical Engineering |
Files in This Item:
File | Size | Format | |
---|---|---|---|
ntu-111-2.pdf Access restricted to NTU campus IPs (use the VPN service for off-campus access) | 5.65 MB | Adobe PDF | View/Open |
All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.