Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/90541
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 郭大維 | zh_TW |
dc.contributor.advisor | Tei-Wei Kuo | en |
dc.contributor.author | 林詩哲 | zh_TW |
dc.contributor.author | Shi-Zhe Lin | en |
dc.date.accessioned | 2023-10-03T16:33:07Z | - |
dc.date.available | 2023-11-09 | - |
dc.date.copyright | 2023-10-03 | - |
dc.date.issued | 2023 | - |
dc.date.submitted | 2023-08-09 | - |
dc.identifier.citation | J. Choi, Z. Wang, S. Venkataramani, P. I.-J. Chuang, V. Srinivasan, and K. Gopalakrishnan. PACT: Parameterized clipping activation for quantized neural networks. arXiv preprint arXiv:1805.06085, 2018.
Q. Deng, Y. Zhang, M. Zhang, and J. Yang. LAcc: Exploiting lookup table-based fast and accurate vector multiplication in DRAM-based CNN accelerator. In Proceedings of the 56th Annual Design Automation Conference, pages 1–6, 2019.
Y. Jeon, B. Park, S. J. Kwon, B. Kim, J. Yun, and D. Lee. BiQGEMM: Matrix multiplication with lookup table for binary-coding-based quantized DNNs. In SC20: International Conference for High Performance Computing, Networking, Storage and Analysis, pages 1–14. IEEE, 2020.
A. Krizhevsky, G. Hinton, et al. Learning multiple layers of features from tiny images. 2009.
Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998.
A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, et al. PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems, 32, 2019.
K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
H. Wu, P. Judd, X. Zhang, M. Isaev, and P. Micikevicius. Integer quantization for deep learning inference: Principles and empirical evaluation. arXiv preprint arXiv:2004.09602, 2020. | - |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/90541 | - |
dc.description.abstract | As the complexity of neural networks continues to increase, their computational and memory demands grow accordingly. The large number of parameters in these advanced models aggravates inference latency and energy consumption, particularly in resource-constrained settings such as edge devices and real-time systems. To address these challenges, we propose an efficient table-lookup-based inference strategy for neural networks. Our strategy determines the model's bit-widths and vector dimensions, effectively balancing accuracy, inference latency, and space occupancy. In addition, we convert the quantized model into a lookup-table-based model, replacing the costly dot-product operations with simple table lookups and thereby improving inference efficiency. Our experimental results show that, compared with conventional floating-point models, our strategy achieves more than a 2x speedup and a corresponding improvement in energy efficiency without sacrificing much space. | zh_TW |
dc.description.abstract | As the complexity of artificial neural networks escalates, so do their computational and memory requirements. The substantial parameter counts inherent in these advanced models exacerbate inference latency and energy consumption, posing significant challenges in resource-constrained environments such as edge devices and real-time systems. To counter these challenges, we introduce an efficient inferencing strategy for neural networks that utilizes a compact lookup table (LUT). Our method determines the optimal bit-widths for the model and the vector dimensions, striking a careful balance between accuracy, latency, and space overhead. Additionally, we convert the quantized model into an LUT-based model, replacing the computationally demanding dot-product operations with simple table lookups, thereby enhancing efficiency. Our experiments show over a 2x speedup and a similar improvement in energy efficiency compared to floating-point models, achieved without substantial space overhead. (A simplified sketch of this table-lookup scheme appears after the metadata record below.) | en |
dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2023-10-03T16:33:07Z No. of bitstreams: 0 | en |
dc.description.provenance | Made available in DSpace on 2023-10-03T16:33:07Z (GMT). No. of bitstreams: 0 | en |
dc.description.tableofcontents | Oral Examination Committee Certification i
Acknowledgements ii
Chinese Abstract iii
Abstract iv
Contents v
List of Figures vii
List of Tables viii
1 Introduction 1
2 Background and Motivation 4
2.1 Background 4
2.1.1 LUT for Dot Products 4
2.1.2 LUT Space Overhead 6
2.2 Motivation 8
3 LUT-based Inferencing Strategy 10
3.1 Hyperparameter Search 10
3.2 Model Conversion 12
3.2.1 Weight Vector Set Construction 13
3.2.2 Weight Index Assignment 14
3.2.3 LUT Precomputation 15
4 Evaluation 18
4.1 Experiment Setup 18
4.2 Experiment Results 19
4.2.1 Hyperparameter Search Results 19
4.2.2 Latency and Energy Consumption 20
4.2.3 Space Overhead Analysis 21
5 Conclusion 23
References 24 | - |
dc.language.iso | en | - |
dc.title | 基於查表法之高效神經網路推論 | zh_TW |
dc.title | Efficient Neural Network Inferencing with Table Lookup | en |
dc.type | Thesis | - |
dc.date.schoolyear | 111-2 | - |
dc.description.degree | Master's | - |
dc.contributor.coadvisor | 張原豪 | zh_TW |
dc.contributor.coadvisor | Yuan-Hao Chang | en |
dc.contributor.oralexamcommittee | 施吉昇;洪士灝;徐慰中;王克中 | zh_TW |
dc.contributor.oralexamcommittee | Chi-Sheng Shih;Shih-Hao Hung;Wei-Chung Hsu;Keh-Chung Wang | en |
dc.subject.keyword | 查找表,神經網路,推論,內積,量化 | zh_TW |
dc.subject.keyword | Lookup Table,Neural Network,Inference,Dot Product,Quantization | en |
dc.relation.page | 25 | - |
dc.identifier.doi | 10.6342/NTU202302336 | - |
dc.rights.note | Not authorized | - |
dc.date.accepted | 2023-08-10 | - |
dc.contributor.author-college | College of Electrical Engineering and Computer Science | - |
dc.contributor.author-dept | Department of Computer Science and Information Engineering | - |
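The abstract describes replacing dot products with lookups into a table built from a small set of quantized weight vectors. As a rough illustration of how such a scheme can work, here is a minimal NumPy sketch of a product-quantization-style LUT linear layer. It is a toy under stated assumptions, not the thesis's actual implementation; the names `codebook`, `weight_idx`, and `lut_linear`, and the random codebook, are made up for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

V = 4             # vector dimension: weight rows are split into length-V subvectors
K = 16            # codebook size: each subvector index fits in 4 bits (2**4 entries)
IN, OUT = 64, 32  # layer shape; IN must be divisible by V

# Offline: a small set of representative weight vectors (random here; a real
# conversion would choose them so the quantized weights track the trained ones).
codebook = rng.standard_normal((K, V)).astype(np.float32)

# Offline: every length-V weight subvector is replaced by a codebook index.
weight_idx = rng.integers(0, K, size=(OUT, IN // V))

def lut_linear(x: np.ndarray) -> np.ndarray:
    """Compute y = W @ x with table lookups instead of per-weight multiplies."""
    # Build the lookup table once per input: partial dot products between each
    # input subvector and each codebook entry. Cost: (IN/V) * K small dot
    # products, shared by all OUT output neurons.
    lut = x.reshape(IN // V, V) @ codebook.T          # shape (IN/V, K)
    # Each output neuron is now just a sum of table lookups, no multiplies.
    return lut[np.arange(IN // V), weight_idx].sum(axis=1)

# Sanity check against the equivalent dense layer.
x = rng.standard_normal(IN).astype(np.float32)
w_dense = codebook[weight_idx].reshape(OUT, IN)
assert np.allclose(lut_linear(x), w_dense @ x, atol=1e-3)
```

With V and K as tunable knobs, the sketch mirrors the trade-off the abstract names: the per-input table costs (IN/V)·K multiplications regardless of OUT, while the table and index storage grow with K and shrink with V, so the bit-width and vector-dimension search balances accuracy against latency and space.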
Appears in Collections: Department of Computer Science and Information Engineering
Files in This Item:
File | Size | Format |
---|---|---|
ntu-111-2.pdf (currently not authorized for public access) | 2.59 MB | Adobe PDF |
All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.