Please use this identifier to cite or link to this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/59095
Title: | 高效之類神經網路自動延伸演算法 An Adaptive Layer Expansion Algorithm for Efficient Training of Deep Neural Networks |
Authors: | Yi-Long Chen 陳鎰龍 |
Advisor: | 劉邦鋒(Pangfeng Liu) |
Keyword: | machine learning, deep learning, convolutional neural network, model compression, gradient variation |
Publication Year: | 2020 |
Degree: | Master's |
Abstract: | Enlarging deep neural networks improves accuracy, but the resulting parameter counts and training times keep such networks from running on resource-constrained devices. In this thesis, we propose an algorithm that expands the network automatically to address both training time and excess parameters. The algorithm starts training with a smaller model and then, after each full pass over the training data, adds neurons to the layer whose gradients are the least stable. Experimental results show that the method saves considerable computation time, and that an early-stopping mechanism both reduces computation further and compresses the final model. In this paper, we propose an adaptive layer expansion algorithm to reduce the training time of deep neural networks without noticeable loss of accuracy. Neural networks have become deeper and wider to improve accuracy, and the size of such networks makes them time-consuming to train. Hence, we propose an adaptive layer expansion algorithm that reduces training time by dynamically adding nodes where necessary, improving training efficiency without losing accuracy. We start with a smaller model that has only a fraction of the parameters of the original model, then train the network and add nodes to specific layers determined by the variance of their gradients. The algorithm repeatedly adds nodes until a threshold is reached, and trains the model until the accuracy converges. The experimental results indicate that our algorithm uses only a quarter of the computation time of the full model and achieves 64.59% accuracy on MobileNet with the CIFAR100 dataset, about 1.6 percentage points below the 66.2% accuracy of the full model. The algorithm stops adding nodes when the model has only half of the parameters of the original model. As a result, the new model is well suited to fast inference in environments where both computation power and memory are very limited, such as mobile devices. |
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/59095 |
DOI: | 10.6342/NTU202003448 |
Fulltext Rights: | Paid license |
Appears in Collections: | Department of Computer Science and Information Engineering |
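The expansion step summarized in the abstract (train a small model, then widen the layer whose gradients vary the most, until a size threshold is reached) might be sketched as follows. This is an illustrative sketch only, not the thesis implementation: the function names, the fully connected layer model, the use of population variance over sampled gradient values, and the parameter `budget` are all assumptions made here for the example. The thesis itself evaluates MobileNet, a convolutional network, and its full text is not available in this record.

```python
from statistics import pvariance


def pick_layer_to_expand(grad_samples):
    """Return the index of the layer whose gradient samples vary the most.

    grad_samples[i] is a list of gradient values observed for layer i
    during one pass over the data (hypothetical representation).
    """
    variances = [pvariance(samples) for samples in grad_samples]
    return max(range(len(variances)), key=variances.__getitem__)


def param_count(widths, n_in, n_out):
    """Weights plus biases of a fully connected net with the given hidden widths."""
    sizes = [n_in, *widths, n_out]
    return sum(a * b + b for a, b in zip(sizes, sizes[1:]))


def expand_once(widths, grad_samples, step, n_in, n_out, budget):
    """Widen the least stable layer by `step` units.

    Returns (new_widths, expanded). The layer is left unchanged when the
    wider model would exceed `budget` parameters (assumed stopping rule).
    """
    idx = pick_layer_to_expand(grad_samples)
    candidate = list(widths)
    candidate[idx] += step
    if param_count(candidate, n_in, n_out) > budget:
        return list(widths), False
    return candidate, True


# Two hidden layers of width 4; the second layer's gradients fluctuate more.
widths = [4, 4]
grads = [[0.1, 0.1, 0.1], [0.5, -0.5, 0.2]]
new_widths, expanded = expand_once(widths, grads, step=2, n_in=8, n_out=2, budget=100)
print(new_widths, expanded)
```

In a full training loop, a call like `expand_once` would run once per epoch with freshly collected gradient statistics, and training would continue at the final width until accuracy converges.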
Files in This Item:
File | Size | Format |
---|---|---|
U0001-1408202016281000.pdf (Restricted Access) | 874.74 kB | Adobe PDF |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.