A Kernel Unfolding Approach over NVM Crossbar Accelerators for Convolutional Neural Networks

Yueh-Han Wu; 巫岳翰

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/21518

Title:	基於非揮發記憶體之卷積神經網路加速器的展開方法 (A Kernel Unfolding Approach over NVM Crossbar Accelerators for Convolutional Neural Networks)
Author:	Yueh-Han Wu 巫岳翰
Advisor:	郭大維
Co-advisor:	張原豪
Keywords:	non-volatile memory, NVM, convolutional neural network, CNN, acceleration, crossbar
Year of publication:	2019
Degree:	Master's
Abstract:	Many recent works have designed accelerators for convolutional neural networks (CNNs) that aim to outperform the conventional von Neumann architecture. Because computing power keeps growing while memory speed grows relatively slowly, data movement between computing units and memory devices has become the bottleneck that limits the performance of these accelerators. To reduce this data movement, the Processing-In-Memory (PIM) architecture has been widely studied and advocated. However, PIM only reduces data movement between off-chip memory and the chip; movement between on-chip buffers and the computing units remains. To further reduce this on-chip data movement, this thesis proposes a kernel-unfolding approach that trades on-chip computing power for better utilization of the input feature map, so that less input data is moved and overlapping input data is not fed in repeatedly. The proposed approach yields improvements of 16.2×, 1.62×, and 19.2× in cycle count, execution time, and input data amount, respectively.
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/21518
DOI:	10.6342/NTU201901620
Full-text authorization:	Not authorized
Appears in collections:	Department of Computer Science and Information Engineering

Files in this item:

File	Size	Format
ntu-108-1.pdf (not currently authorized for public access)	1.71 MB	Adobe PDF

Items in this system are protected by copyright, with all rights reserved, unless their copyright terms state otherwise.
