Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/90541
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 郭大維 | zh_TW |
dc.contributor.advisor | Tei-Wei Kuo | en |
dc.contributor.author | 林詩哲 | zh_TW |
dc.contributor.author | Shi-Zhe Lin | en |
dc.date.accessioned | 2023-10-03T16:33:07Z | - |
dc.date.available | 2023-11-09 | - |
dc.date.copyright | 2023-10-03 | - |
dc.date.issued | 2023 | - |
dc.date.submitted | 2023-08-09 | - |
dc.identifier.citation | J. Choi, Z. Wang, S. Venkataramani, P. I.-J. Chuang, V. Srinivasan, and K. Gopalakrishnan. PACT: Parameterized clipping activation for quantized neural networks. arXiv preprint arXiv:1805.06085, 2018.
Q. Deng, Y. Zhang, M. Zhang, and J. Yang. LAcc: Exploiting lookup table-based fast and accurate vector multiplication in DRAM-based CNN accelerator. In Proceedings of the 56th Annual Design Automation Conference, pages 1–6, 2019.
Y. Jeon, B. Park, S. J. Kwon, B. Kim, J. Yun, and D. Lee. BiQGEMM: Matrix multiplication with lookup table for binary-coding-based quantized DNNs. In SC20: International Conference for High Performance Computing, Networking, Storage and Analysis, pages 1–14. IEEE, 2020.
A. Krizhevsky, G. Hinton, et al. Learning multiple layers of features from tiny images. 2009.
Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998.
A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, et al. PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems, 32, 2019.
K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
H. Wu, P. Judd, X. Zhang, M. Isaev, and P. Micikevicius. Integer quantization for deep learning inference: Principles and empirical evaluation. arXiv preprint arXiv:2004.09602, 2020. | - |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/90541 | - |
dc.description.abstract | As the complexity of neural networks continues to increase, their computational and memory demands grow accordingly. The large number of parameters in these advanced models aggravates inference latency and energy consumption, particularly in resource-constrained settings such as edge devices and real-time systems. To address these challenges, we propose an efficient table-lookup-based inference strategy for neural networks. Our strategy determines the model's bit-widths and vector dimensions, effectively balancing accuracy, inference latency, and space occupancy. In addition, we convert the quantized model into a lookup-table-based model, replacing the costly dot-product operations with simple table lookups and thereby improving inference efficiency. Our experimental results show that, compared with conventional floating-point models, our strategy achieves more than a 2x speedup and a corresponding improvement in energy efficiency without sacrificing much space. | zh_TW |
dc.description.abstract | As the complexity of artificial neural networks escalates, so do their computational and memory requirements. The substantial parameter counts inherent in these advanced models exacerbate inference latency and energy consumption, posing significant challenges in resource-constrained environments such as edge devices and real-time systems. To counter these challenges, we introduce an efficient inferencing strategy for neural networks that utilizes a compact lookup table (LUT). Our method determines the optimal bit-widths for the model and the vector dimensions, striking a careful balance between accuracy, latency, and space overhead. Additionally, we convert the quantized model into an LUT-based model, replacing the computationally demanding dot-product operations with simple table lookups, thereby enhancing efficiency. Our experiments show over a 2x speedup and a similar improvement in energy efficiency compared to floating-point models, achieved without substantial space overhead. (A simplified sketch of this table-lookup scheme appears after the metadata record below.) | en |
dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2023-10-03T16:33:07Z No. of bitstreams: 0 | en |
dc.description.provenance | Made available in DSpace on 2023-10-03T16:33:07Z (GMT). No. of bitstreams: 0 | en |
dc.description.tableofcontents | Oral Examination Committee Certification i
Acknowledgements ii
Chinese Abstract iii
Abstract iv
Contents v
List of Figures vii
List of Tables viii
1 Introduction 1
2 Background and Motivation 4
2.1 Background 4
2.1.1 LUT for Dot Products 4
2.1.2 LUT Space Overhead 6
2.2 Motivation 8
3 LUT-based Inferencing Strategy 10
3.1 Hyperparameter Search 10
3.2 Model Conversion 12
3.2.1 Weight Vector Set Construction 13
3.2.2 Weight Index Assignment 14
3.2.3 LUT Precomputation 15
4 Evaluation 18
4.1 Experiment Setup 18
4.2 Experiment Results 19
4.2.1 Hyperparameter Search Results 19
4.2.2 Latency and Energy Consumption 20
4.2.3 Space Overhead Analysis 21
5 Conclusion 23
References 24 | - |
dc.language.iso | en | - |
dc.title | 基於查表法之高效神經網路推論 | zh_TW |
dc.title | Efficient Neural Network Inferencing with Table Lookup | en |
dc.type | Thesis | - |
dc.date.schoolyear | 111-2 | - |
dc.description.degree | Master's | - |
dc.contributor.coadvisor | 張原豪 | zh_TW |
dc.contributor.coadvisor | Yuan-Hao Chang | en |
dc.contributor.oralexamcommittee | 施吉昇;洪士灝;徐慰中;王克中 | zh_TW |
dc.contributor.oralexamcommittee | Chi-Sheng Shih;Shih-Hao Hung;Wei-Chung Hsu;Keh-Chung Wang | en |
dc.subject.keyword | 查找表,神經網路,推論,內積,量化 | zh_TW |
dc.subject.keyword | Lookup Table,Neural Network,Inference,Dot Product,Quantization | en |
dc.relation.page | 25 | - |
dc.identifier.doi | 10.6342/NTU202302336 | - |
dc.rights.note | Not authorized | - |
dc.date.accepted | 2023-08-10 | - |
dc.contributor.author-college | College of Electrical Engineering and Computer Science | - |
dc.contributor.author-dept | Department of Computer Science and Information Engineering | - |
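The abstract describes replacing dot products with lookups into a table built from a small set of quantized weight vectors. As a rough illustration of how such a scheme can work, here is a minimal NumPy sketch of a product-quantization-style LUT linear layer. It is a toy under stated assumptions, not the thesis's actual implementation; the names `codebook`, `weight_idx`, and `lut_linear`, and the random codebook, are made up for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

V = 4             # vector dimension: weight rows are split into length-V subvectors
K = 16            # codebook size: each subvector index fits in 4 bits (2**4 entries)
IN, OUT = 64, 32  # layer shape; IN must be divisible by V

# Offline: a small set of representative weight vectors (random here; a real
# conversion would choose them so the quantized weights track the trained ones).
codebook = rng.standard_normal((K, V)).astype(np.float32)

# Offline: every length-V weight subvector is replaced by a codebook index.
weight_idx = rng.integers(0, K, size=(OUT, IN // V))

def lut_linear(x: np.ndarray) -> np.ndarray:
    """Compute y = W @ x with table lookups instead of per-weight multiplies."""
    # Build the lookup table once per input: partial dot products between each
    # input subvector and each codebook entry. Cost: (IN/V) * K small dot
    # products, shared by all OUT output neurons.
    lut = x.reshape(IN // V, V) @ codebook.T          # shape (IN/V, K)
    # Each output neuron is now just a sum of table lookups, no multiplies.
    return lut[np.arange(IN // V), weight_idx].sum(axis=1)

# Sanity check against the equivalent dense layer.
x = rng.standard_normal(IN).astype(np.float32)
w_dense = codebook[weight_idx].reshape(OUT, IN)
assert np.allclose(lut_linear(x), w_dense @ x, atol=1e-3)
```

With V and K as tunable knobs, the sketch mirrors the trade-off the abstract names: the per-input table costs (IN/V)·K multiplications regardless of OUT, while the table and index storage grow with K and shrink with V, so the bit-width and vector-dimension search balances accuracy against latency and space.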
Appears in Collections: Department of Computer Science and Information Engineering
Files in This Item:
File | Size | Format |
---|---|---|
ntu-111-2.pdf (currently not authorized for public access) | 2.59 MB | Adobe PDF |
All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.