Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/67115

| Title: | Distilling from the Past: Self-Distillation by Using Epoch Regularization (以知識迭代為規範的深度模型訓練) |
| Author: | Jo-Han Hsu (許若漢) |
| Advisor: | Yung-Yu Chuang (莊永裕) |
| Co-advisor: | Yen-Yu Lin (林彥宇) |
| Keywords: | image classification, deep learning, convolutional neural networks |
| Publication Year: | 2017 |
| Degree: | Master |
| Abstract: | Regularizing neural networks has been shown to improve performance in supervised learning. Knowledge distillation (Hinton et al.) is one such approach: in addition to the hard targets provided by the dataset, it trains a model with soft targets produced by a stronger teacher model. In many problems, however, no such stronger model is available. This thesis proposes self-distillation by epoch regularization (sDER), which lets a model regularize itself with knowledge distilled from its own past training, namely the epoch models (snapshots obtained after each training epoch) that every training run already produces. Several epoch models are selected to form an ensemble for regularization, and optimization considers both fitting the hard targets and matching the ensemble's soft targets. Experiments show this is effective: on the well-known CIFAR-10 image-classification benchmark, a 110-layer network trained with sDER outperforms its 1,202-layer counterpart. The method is simple, adds little extra training time or resources, and can be applied to many different models and domains. (A minimal sketch of this training scheme follows the metadata table below.) |
| URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/67115 |
| DOI: | 10.6342/NTU201702904 |
| Full-Text License: | Authorized for a fee |
| Appears in Collections: | Department of Computer Science and Information Engineering |
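
The abstract describes sDER only at a high level: snapshots saved at the end of past epochs supply soft targets that regularize later training. The Python sketch below illustrates that idea in a PyTorch-style training loop. It is a minimal illustration under assumed settings, not the thesis's implementation; the loss weight `alpha`, the temperature, and the keep-the-last-k snapshot selection policy are placeholders.

```python
# Minimal sketch of self-distillation from past epoch snapshots, in the spirit
# of the sDER idea in the abstract. The snapshot-selection policy, loss weight
# (alpha), and temperature are illustrative assumptions, not the thesis's settings.
import copy

import torch
import torch.nn.functional as F


def epoch_ensemble_soft_targets(snapshots, inputs, temperature):
    """Average the softened predictions of the stored epoch models (the ensemble)."""
    with torch.no_grad():
        probs = [F.softmax(m(inputs) / temperature, dim=1) for m in snapshots]
        return torch.stack(probs, dim=0).mean(dim=0)


def train_with_epoch_regularization(model, loader, num_epochs, num_snapshots=3,
                                    alpha=0.5, temperature=4.0, lr=0.1):
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    snapshots = []  # frozen copies of the model taken at the end of past epochs

    for epoch in range(num_epochs):
        for inputs, labels in loader:
            logits = model(inputs)
            loss = F.cross_entropy(logits, labels)  # fit the hard targets

            if snapshots:
                # Epoch regularization: match the past-epoch ensemble's soft targets.
                soft = epoch_ensemble_soft_targets(snapshots, inputs, temperature)
                log_p = F.log_softmax(logits / temperature, dim=1)
                kd = F.kl_div(log_p, soft, reduction="batchmean") * temperature ** 2
                loss = (1.0 - alpha) * loss + alpha * kd

            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

        # Keep only the most recent snapshots (one simple selection policy).
        frozen = copy.deepcopy(model).eval()
        for p in frozen.parameters():
            p.requires_grad_(False)
        snapshots.append(frozen)
        snapshots = snapshots[-num_snapshots:]

    return model
```

In use, a call such as `train_with_epoch_regularization(resnet110, train_loader, num_epochs=160)` would combine the usual cross-entropy on hard labels with a KL term against the past-epoch ensemble, which is the combination the abstract credits for the 110-layer network's gains.
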
Files in This Item:
| File | Size | Format |
|---|---|---|
| ntu-106-1.pdf (restricted; not authorized for public access) | 932.09 kB | Adobe PDF |
Items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.
