Sparse ReRAM Engine: Joint Exploration of Activation and Weight Sparsity in Compressed Neural Networks

Tzu-Hsien Yang; 楊子賢

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/74357

Title:	Sparse ReRAM Engine: Joint Exploration of Activation and Weight Sparsity in Compressed Neural Networks
Author:	Tzu-Hsien Yang (楊子賢)
Advisor:	Chia-Lin Yang (楊佳玲)
Keywords:	neural network, sparsity, ReRAM, accelerator architecture
Publication year:	2019
Degree:	Master's
Abstract:	Exploiting model sparsity to reduce ineffectual computation is a common approach to achieving energy efficiency in DNN inference accelerators. However, because of the tightly coupled crossbar structure, exploiting sparsity in ReRAM-based NN accelerators remains a less explored area. Existing architectural studies on ReRAM-based NN accelerators assume that an entire crossbar array can be activated in a single cycle. In practice, to preserve inference accuracy, matrix-vector computation must be conducted at a smaller granularity, called an Operation Unit (OU). An OU-based architecture creates a new opportunity to exploit DNN sparsity. In this thesis, we propose the first practical Sparse ReRAM Engine that exploits both weight and activation sparsity. Our evaluation shows that the proposed method is effective in eliminating ineffectual computation and delivers significant performance improvement and energy savings.
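The OU idea in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's design: the function name `ou_matvec`, the 2x2 OU size, and the skip rule (skip an OU when all of its inputs or all of its weights are zero) are assumptions made only for illustration. The sketch computes a crossbar-style product `activations @ weights` one OU-sized block at a time and counts how many OUs are actually activated.

```python
import numpy as np

def ou_matvec(weights, activations, ou_rows=2, ou_cols=2, skip_zeros=True):
    """Compute activations @ weights one OU-sized block at a time.

    Returns (output vector, number of OU activations performed).
    The OU size and the skip rule are illustrative assumptions.
    """
    n_rows, n_cols = weights.shape
    output = np.zeros(n_cols, dtype=np.result_type(weights, activations))
    ou_count = 0
    for r in range(0, n_rows, ou_rows):
        x_slice = activations[r:r + ou_rows]          # inputs driving these rows
        for c in range(0, n_cols, ou_cols):
            w_block = weights[r:r + ou_rows, c:c + ou_cols]
            # Hypothetical skip rule: an OU contributes nothing if all of its
            # inputs are zero or all of its weights are zero.
            if skip_zeros and (not x_slice.any() or not w_block.any()):
                continue
            output[c:c + ou_cols] += x_slice @ w_block
            ou_count += 1
    return output, ou_count

# Toy 4x4 crossbar with sparse weights and a sparse input vector.
W = np.array([[1, 0, 0, 0],
              [2, 0, 0, 0],
              [0, 0, 3, 4],
              [0, 0, 5, 6]])
x = np.array([0, 0, 1, 2])

dense_y, dense_ous = ou_matvec(W, x, skip_zeros=False)
sparse_y, sparse_ous = ou_matvec(W, x)
print(dense_ous, sparse_ous, sparse_y)
```

In this toy case both runs produce the same output vector, while the skipping run activates fewer OUs. How the actual engine detects and skips ineffectual OUs in hardware is described in the thesis, not here.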
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/74357
DOI:	10.6342/NTU201903060
Full-text license:	Paid license
Appears in department:	Department of Computer Science and Information Engineering

Files in this item:

File	Size	Format
ntu-108-1.pdf (not currently authorized for public access)	1.64 MB	Adobe PDF
