NTU Theses and Dissertations Repository
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/90541
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | 郭大維 | zh_TW
dc.contributor.advisor | Tei-Wei Kuo | en
dc.contributor.author | 林詩哲 | zh_TW
dc.contributor.author | Shi-Zhe Lin | en
dc.date.accessioned | 2023-10-03T16:33:07Z | -
dc.date.available | 2023-11-09 | -
dc.date.copyright | 2023-10-03 | -
dc.date.issued | 2023 | -
dc.date.submitted | 2023-08-09 | -
dc.identifier.citation | J. Choi, Z. Wang, S. Venkataramani, P. I.-J. Chuang, V. Srinivasan, and K. Gopalakrishnan. PACT: Parameterized clipping activation for quantized neural networks. arXiv preprint arXiv:1805.06085, 2018.
Q. Deng, Y. Zhang, M. Zhang, and J. Yang. LAcc: Exploiting lookup table-based fast and accurate vector multiplication in DRAM-based CNN accelerator. In Proceedings of the 56th Annual Design Automation Conference, pages 1–6, 2019.
Y. Jeon, B. Park, S. J. Kwon, B. Kim, J. Yun, and D. Lee. BiQGEMM: Matrix multiplication with lookup table for binary-coding-based quantized DNNs. In SC20: International Conference for High Performance Computing, Networking, Storage and Analysis, pages 1–14. IEEE, 2020.
A. Krizhevsky, G. Hinton, et al. Learning multiple layers of features from tiny images. 2009.
Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998.
A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, et al. PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems, 32, 2019.
K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
H. Wu, P. Judd, X. Zhang, M. Isaev, and P. Micikevicius. Integer quantization for deep learning inference: Principles and empirical evaluation. arXiv preprint arXiv:2004.09602, 2020. | -
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/90541 | -
dc.description.abstract | 隨著神經網路的複雜度持續增加,其計算和記憶體的需求也相應地成長。這些先進的模型中大量的參數加劇了推論延遲和能源消耗,尤其在邊緣裝置和即時系統等資源有限的環境中。為了應對這些挑戰,我們提出了一種基於查表法的高效神經網路推論策略。我們的策略決定了模型的位寬和向量維度,有效地平衡準確度、推論延遲以及空間占用。此外,我們將量化模型轉換為以查找表為基礎的模型,用簡單的查表法取代了複雜的內積運算,從而提高推論的效率。我們的實驗結果顯示,與傳統以浮點數運算的模型相比,我們的策略達到超過兩倍的速度提升和能源效率改善而且沒有犧牲太多空間。 | zh_TW
dc.description.abstract | As the complexity of artificial neural networks escalates, so do their computational and memory requirements. The substantial parameter counts of these advanced models exacerbate inference latency and energy consumption, posing significant challenges in resource-constrained environments such as edge devices and real-time systems. To counter these challenges, we introduce an efficient inferencing strategy for neural networks that utilizes a compact Lookup Table (LUT). Our method determines the optimal bit widths and vector dimensions for the model, achieving a careful balance between accuracy, latency, and space overhead. Additionally, we convert the quantized model into an LUT-based model, replacing the computationally demanding dot-product operations with simple table lookups, thereby enhancing efficiency. Our experiments show over a 2x speedup and a similar improvement in energy efficiency in comparison to floating-point models, accomplished without substantial space overhead. | en
dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2023-10-03T16:33:07Z. No. of bitstreams: 0 | en
dc.description.provenance | Made available in DSpace on 2023-10-03T16:33:07Z (GMT). No. of bitstreams: 0 | en
dc.description.tableofcontents | Oral Examination Committee Certification i
Acknowledgements ii
Chinese Abstract iii
Abstract iv
Contents v
List of Figures vii
List of Tables viii
1 Introduction 1
2 Background and Motivation 4
2.1 Background 4
2.1.1 LUT for Dot Products 4
2.1.2 LUT Space Overhead 6
2.2 Motivation 8
3 LUT-based Inferencing Strategy 10
3.1 Hyperparameter Search 10
3.2 Model Conversion 12
3.2.1 Weight Vector Set Construction 13
3.2.2 Weight Index Assignment 14
3.2.3 LUT Precomputation 15
4 Evaluation 18
4.1 Experiment Setup 18
4.2 Experiment Results 19
4.2.1 Hyperparameter Search Results 19
4.2.2 Latency and Energy Consumption 20
4.2.3 Space Overhead Analysis 21
5 Conclusion 23
References 24 | -
dc.language.iso | en | -
dc.subject | 神經網路 | zh_TW
dc.subject | 內積 | zh_TW
dc.subject | 推論 | zh_TW
dc.subject | 查找表 | zh_TW
dc.subject | 量化 | zh_TW
dc.subject | Lookup Table | en
dc.subject | Dot Product | en
dc.subject | Inference | en
dc.subject | Neural Network | en
dc.subject | Quantization | en
dc.title | 基於查表法之高效神經網路推論 | zh_TW
dc.title | Efficient Neural Network Inferencing with Table Lookup | en
dc.type | Thesis | -
dc.date.schoolyear | 111-2 | -
dc.description.degree | 碩士 (Master's) | -
dc.contributor.coadvisor | 張原豪 | zh_TW
dc.contributor.coadvisor | Yuan-Hao Chang | en
dc.contributor.oralexamcommittee | 施吉昇; 洪士灝; 徐慰中; 王克中 | zh_TW
dc.contributor.oralexamcommittee | Chi-Sheng Shih; Shih-Hao Hung; Wei-Chung Hsu; Keh-Chung Wang | en
dc.subject.keyword | 查找表, 神經網路, 推論, 內積, 量化 | zh_TW
dc.subject.keyword | Lookup Table, Neural Network, Inference, Dot Product, Quantization | en
dc.relation.page | 25 | -
dc.identifier.doi | 10.6342/NTU202302336 | -
dc.rights.note | 未授權 (not authorized for public access) | -
dc.date.accepted | 2023-08-10 | -
dc.contributor.author-college | College of Electrical Engineering and Computer Science (電機資訊學院) | -
dc.contributor.author-dept | Department of Computer Science and Information Engineering (資訊工程學系) | -
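
To make the mechanism described in the abstract concrete, the following is a minimal, self-contained Python/NumPy sketch of the general lookup-table idea: the dot products between a small codebook of quantized weight vectors and every representable quantized activation vector are precomputed once, so inference replaces each dot product with a single table lookup. This is only an illustration under assumed parameters, not the thesis's implementation; the bit width, vector dimension, codebook values, and helper names (BITS, VEC_DIM, act_index) are all assumptions made for the example.

from itertools import product
import numpy as np

BITS = 2       # assumed activation bit width: 2**BITS quantization levels
VEC_DIM = 2    # assumed vector dimension: dot products proceed VEC_DIM elements at a time
LEVELS = 2 ** BITS

# A toy codebook of quantized weight vectors (the "weight vector set").
weight_codebook = np.array([[1.0, -0.5],
                            [0.5, 0.25],
                            [-1.0, 1.0]])     # shape (3, VEC_DIM)

# Enumerate every representable quantized activation vector, in an order
# that matches the radix-LEVELS index computed in act_index() below.
act_levels = np.linspace(0.0, 1.0, LEVELS)    # assumed uniform levels on [0, 1]
act_vectors = np.array(list(product(act_levels, repeat=VEC_DIM)))

# LUT precomputation: lut[w, a] is the dot product of weight vector w with
# activation vector a, so inference itself needs no multiplications.
lut = weight_codebook @ act_vectors.T         # shape (3, LEVELS**VEC_DIM)

def act_index(level_indices):
    """Map a vector of activation level indices to its LUT column."""
    col = 0
    for i in level_indices:                   # radix-LEVELS encoding
        col = col * LEVELS + i
    return col

# Inference on one activation vector: quantize to level indices, then look up.
a = np.array([0.3, 0.9])
idx = np.clip(np.round(a * (LEVELS - 1)), 0, LEVELS - 1).astype(int)
col = act_index(idx)
for w in range(len(weight_codebook)):
    approx = lut[w, col]                                # single table lookup
    exact = weight_codebook[w] @ (idx / (LEVELS - 1))   # quantized dot product
    assert np.isclose(approx, exact)
print("lookup column", col, "->", lut[:, col])

The sketch also makes the space trade-off explicit: the table holds one column per representable activation vector, i.e. 2**(BITS * VEC_DIM) columns per weight vector, which is why bit width and vector dimension must be balanced against accuracy, latency, and LUT size, as the abstract and the hyperparameter search of Chapter 3 describe.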
Appears in collections: Department of Computer Science and Information Engineering (資訊工程學系)

Files in this item:
File | Size | Format
ntu-111-2.pdf (not authorized for public access) | 2.59 MB | Adobe PDF