An Energy-Efficient Approach for Deep-Neural-Network Inference

鄭瑞軒; Rui-Xuan Zheng

Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/78771

Title:	應用於深度神經網路推理之節能方法 An Energy-Efficient Approach for Deep-Neural-Network Inference
Authors:	鄭瑞軒 Rui-Xuan Zheng
Advisor:	劉宗德 Tsung-Te Liu
Keyword:	machine learning, early negative detection, ReLU, deep neural networks, convolutional neural networks, energy-efficient accelerator
Publication Year :	2019
Degree:	Master's
Abstract:	Deep neural networks (DNNs) are now widely used in fields including computer vision, natural language processing, and bio-signal analysis, and many processors have been proposed to execute them more efficiently. A technique common to these designs is zero skipping, which saves energy by skipping operations whose input is zero, exploiting the sparsity that ReLU induces. A more recently introduced technique, early negative detection, exploits ReLU further: it detects at an early stage the computations that would produce a negative output, which ReLU discards, and skips them. However, that technique must run on a bit-serial architecture, so its savings are limited by the bit width of the weights. This work proposes a generic threshold-based approach that realizes early negative detection without requiring a bit-serial scheme, together with a systematic threshold-optimization procedure that minimizes computation while keeping accuracy variation within a user-specified range. At the software level, the proposed approach reduces operations by 31.97% with 0.09% accuracy variation and, with 4-bit weights, improves the computation reduction rate over the previous work by 31.05%. At the hardware level, a 40-nm CMOS reconfigurable DNN processor implementing both conventional zero skipping and the proposed approach is designed to evaluate effectiveness and overhead. The proposed approach alone reduces the processor's energy consumption by 22.8% with a 0.96% loss in test accuracy, and the processor reaches an energy efficiency of 1.04 TOPS/W at 0.81 V and 250 MHz.
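The abstract does not give the detection rule itself, so the following is only an illustrative sketch of how zero skipping and a threshold-based early negative check could combine in one ReLU neuron. The function name, the single checkpoint `check_after`, and the rule "stop if the partial sum is below `threshold`" are all assumptions made for illustration, not the thesis's actual design.

```python
from typing import Sequence, Tuple


def relu_dot_early_negative(
    inputs: Sequence[float],
    weights: Sequence[float],
    threshold: float,
    check_after: int,
) -> Tuple[float, int]:
    """Return (ReLU output, multiply-accumulates performed).

    Hypothetical rule: after `check_after` input positions, if the partial
    sum is below `threshold`, predict a negative pre-activation and stop.
    """
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")

    acc = 0.0
    macs = 0
    for i, (x, w) in enumerate(zip(inputs, weights)):
        if x != 0:
            # Zero skipping: a zero input contributes nothing, so only
            # non-zero inputs cost a multiply-accumulate.
            acc += x * w
            macs += 1
        if i + 1 == check_after and acc < threshold:
            # Early negative detection (assumed form): ReLU would output 0
            # for a negative sum, so the remaining terms are skipped.
            return 0.0, macs
    return max(acc, 0.0), macs


# Partial sum is -3.0 after two positions, below the threshold of -2.0,
# so the last two terms are never computed.
print(relu_dot_early_negative([1.0, 0.0, 2.0, 1.0], [-3.0, 5.0, -1.0, 0.5], -2.0, 2))
```

An early stop is a prediction, not a guarantee: later terms could still have pushed the sum above zero. This is consistent with the abstract's framing of threshold choice as a trade-off between computation saved and accuracy variation.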
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/78771
DOI:	10.6342/NTU201901480
Fulltext Rights:	Not authorized
Embargo Lift Date:	2024-07-16
Appears in Collections:	Graduate Institute of Electronics Engineering

Files in This Item:

File	Size	Format
ntu-107-2.pdf Restricted Access	4.87 MB	Adobe PDF
