Please use this identifier to cite or link to this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/91187
Title: | Class-Aware Model Structure Selection for Efficient Incremental Learning on Edge Devices |
Authors: | 洪靖嘉 Jing-Jia Hung |
Advisor: | 楊佳玲 Chia-Lin Yang |
Keyword: | Pruning, Incremental Learning, Edge Computing |
Publication Year : | 2023 |
Degree: | Master |
Abstract: | Incremental learning (IL) is a training paradigm that gradually increases the "knowledge" of a deep learning (DL) model based on classes of new data that arrive continually at discrete times. Under the IL paradigm, none or only a small amount of data from past classes can be kept, so it is crucial to prevent losing the knowledge learned from those past classes, a failure known as catastrophic forgetting. Moreover, the growing prevalence of edge devices has led to the widespread integration of AI-based applications on these platforms. Because edge devices have limited computing resources, DL models are commonly compressed. Two straightforward ways to combine the IL training paradigm with model compression are (a) compressing the original model after each IL iteration, or (b) compressing the original model once at the initial step and reusing the compressed model in subsequent IL iterations. However, we observe that method (a) risks erasing previously acquired knowledge, even when using a state-of-the-art compression method that needs only a small number of samples to fine-tune the compressed model. On the other hand, method (b) reduces accuracy because the fixed model structure cannot adapt sufficiently to new classes, which is unsuited to the dynamic nature of IL. In response, we propose the first class-aware model structure selection method (CASS). CASS first determines a model structure suitable for both new and old classes, then uses a "Union" method together with fine-grained pruning to preserve past knowledge.
The experimental results show that, across all our test cases, the proposed CASS method achieves accuracy improvements of 5.6% and 1.5% on average over methods (a) and (b), respectively, while reducing training time by 40% compared to method (a). |
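The abstract does not spell out how the "Union" method combines structures for old and new classes. A minimal sketch of one plausible reading, assuming fine-grained (unstructured) magnitude pruning and a union of per-class binary masks so that a weight is kept if any class's structure needs it; the function names and the per-class importance scores here are illustrative assumptions, not the thesis's actual implementation:

```python
import numpy as np

def magnitude_mask(weights, sparsity):
    """Fine-grained pruning: keep only the largest-magnitude entries.

    `sparsity` is the fraction of entries to zero out.
    """
    k = int(weights.size * sparsity)  # number of entries to prune
    if k == 0:
        return np.ones(weights.shape, dtype=bool)
    # k-th smallest absolute value acts as the pruning threshold
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    return np.abs(weights) > threshold

def union_mask(masks):
    """'Union' combination: a weight survives if any class's mask keeps it."""
    out = masks[0].copy()
    for m in masks[1:]:
        out |= m
    return out

rng = np.random.default_rng(0)
w = rng.normal(size=(8, 8))  # toy weight matrix

# Hypothetical per-class structures: old classes vs. a new class whose
# importance estimate differs slightly from the old one.
mask_old = magnitude_mask(w, sparsity=0.7)
mask_new = magnitude_mask(w + rng.normal(scale=0.1, size=w.shape), sparsity=0.7)

combined = union_mask([mask_old, mask_new])
pruned = w * combined  # compressed weights shared by old and new classes
```

The union can only add weights relative to either per-class mask, which matches the stated goal: the structure chosen for new classes never removes connections that the old classes' structure relied on, helping to preserve past knowledge.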
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/91187 |
DOI: | 10.6342/NTU202301431 |
Fulltext Rights: | Not authorized |
Appears in Collections: | Graduate Institute of Networking and Multimedia |
Files in This Item:
File | Size | Format
---|---|---
ntu-112-1.pdf (Restricted Access) | 2.34 MB | Adobe PDF
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.