Please use this identifier to cite or link to this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/88376
Title: | SP-DARTS: Soft Pruning Differentiable Architecture Search |
Authors: | 馮泰哲 Tai-Che Feng |
Advisor: | 王勝德 Sheng-De Wang |
Keyword: | Deep Learning, Model Optimization, Neural Architecture Search, Differentiable Architecture Search, Model Compression, Model Pruning, Soft Pruning |
Publication Year: | 2023 |
Degree: | Master's |
Abstract: | Recently, Differentiable Architecture Search (DARTS) has gained increasing attention due to its simplicity and efficient search capability. However, such search methods have a significant chance of overfitting, which can cause the performance of the discovered models to collapse. In this thesis we propose SP-DARTS, a soft-pruning-based differentiable architecture search method that addresses this issue. First, unlike previous search methods, we view differentiable architecture search as a model-pruning problem: unimportant operations are removed from a supernet containing all possible architectures to obtain the final model. We also argue that traditional hard pruning gradually reduces the capacity of the search space during training, leading to unstable search results, so we instead train with a soft-pruning approach. Second, after training, the original DARTS selects the operation with the largest architecture parameter on each edge to form the final architecture, but we found that this approach does not truly reflect the relative importance of the operations. We therefore directly measure each candidate operation's impact on the supernet and use that as the importance criterion. Finally, we implement our method on the NAS-Bench-201 search space, and the experimental results show that both SP-DARTS and its progressive variant SRP-DARTS are robust search methods that discover architectures with good performance and stable results. |
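The two ideas summarized in the abstract — softly pruning the supernet's architecture parameters, and scoring each operation by its measured impact on the supernet rather than its raw parameter value — can be sketched roughly as follows. This is an illustrative PyTorch sketch, not the thesis implementation: `keep_ratio`, `decay`, and the `supernet_loss` callable are assumed names, and the actual SP-DARTS procedure may differ in its details.

```python
import torch
import torch.nn.functional as F

def soft_prune_weights(alphas, keep_ratio=0.5, decay=0.1):
    """Softly suppress the weakest candidate operations on each edge.

    Unlike hard pruning, suppressed operations keep a small nonzero
    weight and remain trainable, so they can recover in later epochs
    and the search-space capacity is not permanently reduced.
    `keep_ratio` and `decay` are illustrative hyperparameters.
    """
    weights = F.softmax(alphas, dim=-1)          # per-edge mixing weights
    k = max(1, int(weights.size(-1) * keep_ratio))
    topk = weights.topk(k, dim=-1).indices
    mask = torch.full_like(weights, decay)       # weak ops scaled down, not removed
    mask.scatter_(-1, topk, 1.0)
    return weights * mask

def operation_importance(supernet_loss, alphas, edge):
    """Score each candidate operation on `edge` by its impact on the supernet.

    `supernet_loss` is a hypothetical callable that evaluates the trained
    supernet (e.g., validation loss) under a given set of architecture
    parameters. A larger loss increase when an operation is masked out
    indicates that the operation is more important.
    """
    base = supernet_loss(alphas)
    scores = []
    for op in range(alphas.size(-1)):
        masked = alphas.clone()
        masked[edge, op] = float('-inf')         # softmax weight becomes 0
        scores.append(supernet_loss(masked) - base)
    return scores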
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/88376 |
DOI: | 10.6342/NTU202301914 |
Fulltext Rights: | Authorized (access restricted to campus) |
Appears in Collections: | Department of Electrical Engineering |
Files in This Item:
File | Size | Format | |
---|---|---|---|
ntu-111-2.pdf Restricted Access | 5.65 MB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.