Please use this identifier to cite or link to this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/90541
Title: | Efficient Neural Network Inferencing with Table Lookup |
Authors: | 林詩哲 Shi-Zhe Lin |
Advisor: | 郭大維 Tei-Wei Kuo |
Co-Advisor: | 張原豪 Yuan-Hao Chang |
Keyword: | Lookup Table, Neural Network, Inference, Dot Product, Quantization |
Publication Year: | 2023 |
Degree: | Master's |
Abstract: | As the complexity of artificial neural networks escalates, so do their computational and memory requirements. The substantial parameter counts of these advanced models exacerbate inference latency and energy consumption, posing significant challenges in resource-constrained environments such as edge devices and real-time systems. To counter these challenges, we introduce an efficient inference strategy for neural networks that utilizes a compact lookup table (LUT). Our method determines the model's bit-widths and vector dimensions, striking a careful balance between accuracy, inference latency, and space overhead. Additionally, we convert the quantized model into a LUT-based model, replacing the computationally demanding dot-product operations with simple table lookups, thereby improving inference efficiency. Our experiments show over a 2x speedup and a comparable improvement in energy efficiency relative to floating-point models, achieved without substantial space overhead. |
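The core idea in the abstract, replacing dot products over quantized values with table lookups, can be sketched as follows. This is a minimal illustrative example, not the thesis's actual implementation: the 4-bit width, the scale factors, and the uniform quantization scheme are all assumptions chosen for demonstration.

```python
import numpy as np

BITS = 4              # assumed bit-width (illustrative only)
LEVELS = 1 << BITS    # 16 quantization levels per operand

# Hypothetical dequantization scales for weights and activations.
w_scale, a_scale = 0.1, 0.05

# Precompute a LUT of all pairwise products of quantized codes, so
# that inference replaces each multiply with a single table lookup.
lut = np.empty((LEVELS, LEVELS), dtype=np.float32)
for i in range(LEVELS):
    for j in range(LEVELS):
        lut[i, j] = (i * w_scale) * (j * a_scale)

def lut_dot(w_codes, a_codes):
    """Dot product computed via table lookups instead of multiplies."""
    return float(lut[w_codes, a_codes].sum())

# Reference: dequantize to floats and use an ordinary dot product.
w = np.array([3, 7, 15, 0], dtype=np.uint8)
a = np.array([1, 2, 4, 9], dtype=np.uint8)
ref = float(np.dot(w * w_scale, a * a_scale))
print(abs(lut_dot(w, a) - ref) < 1e-5)  # the two paths agree
```

With only 16x16 entries the table is tiny, which is the trade-off the abstract alludes to: lower bit-widths shrink the LUT and speed up lookups, at some cost in accuracy.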
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/90541 |
DOI: | 10.6342/NTU202302336 |
Fulltext Rights: | Not authorized |
Appears in Collections: | Department of Computer Science and Information Engineering |
Files in This Item:
File | Size | Format
---|---|---
ntu-111-2.pdf (Restricted Access) | 2.59 MB | Adobe PDF
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.