Please use this identifier to cite or link to this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/90541
Title: | Efficient Neural Network Inferencing with Table Lookup |
Authors: | 林詩哲 Shi-Zhe Lin |
Advisor: | 郭大維 Tei-Wei Kuo |
Co-Advisor: | 張原豪 Yuan-Hao Chang |
Keyword: | Lookup Table, Neural Network, Inference, Dot Product, Quantization |
Publication Year: | 2023 |
Degree: | Master's |
Abstract: | As the complexity of artificial neural networks escalates, so do their computational and memory requirements. The substantial parameter counts of these advanced models exacerbate inference latency and energy consumption, posing significant challenges in resource-constrained environments such as edge devices and real-time systems. To counter these challenges, we introduce an efficient inference strategy for neural networks that utilizes a compact lookup table (LUT). Our method determines the model's bit-widths and vector dimensions, striking a careful balance between accuracy, inference latency, and space overhead. Additionally, we convert the quantized model into a LUT-based model, replacing the computationally demanding dot-product operations with simple table lookups, thereby improving inference efficiency. Our experiments show over a 2x speedup and a comparable improvement in energy efficiency relative to floating-point models, achieved without substantial space overhead. |
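The core idea in the abstract, replacing dot products over quantized values with table lookups, can be sketched as follows. This is a minimal illustrative example, not the thesis's actual implementation: the 4-bit width, the scale factors, and the uniform quantization scheme are all assumptions chosen for demonstration.

```python
import numpy as np

BITS = 4              # assumed bit-width (illustrative only)
LEVELS = 1 << BITS    # 16 quantization levels per operand

# Hypothetical dequantization scales for weights and activations.
w_scale, a_scale = 0.1, 0.05

# Precompute a LUT of all pairwise products of quantized codes, so
# that inference replaces each multiply with a single table lookup.
lut = np.empty((LEVELS, LEVELS), dtype=np.float32)
for i in range(LEVELS):
    for j in range(LEVELS):
        lut[i, j] = (i * w_scale) * (j * a_scale)

def lut_dot(w_codes, a_codes):
    """Dot product computed via table lookups instead of multiplies."""
    return float(lut[w_codes, a_codes].sum())

# Reference: dequantize to floats and use an ordinary dot product.
w = np.array([3, 7, 15, 0], dtype=np.uint8)
a = np.array([1, 2, 4, 9], dtype=np.uint8)
ref = float(np.dot(w * w_scale, a * a_scale))
print(abs(lut_dot(w, a) - ref) < 1e-5)  # the two paths agree
```

With only 16x16 entries the table is tiny, which is the trade-off the abstract alludes to: lower bit-widths shrink the LUT and speed up lookups, at some cost in accuracy.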
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/90541 |
DOI: | 10.6342/NTU202302336 |
Fulltext Rights: | Not authorized |
Appears in Collections: | Department of Computer Science and Information Engineering |
Files in This Item:
File | Size | Format
---|---|---
ntu-111-2.pdf (Restricted Access) | 2.59 MB | Adobe PDF
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.