NTU Theses and Dissertations Repository
Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/89006
Title: Exploiting Fine-Grained Structured Pruning for Efficient Inference on CNN Model
Authors: Cheng-Hung Wu
Advisor: Pangfeng Liu
Keyword: Machine learning, Deep learning, Convolutional neural network, Model compression, Model pruning, Dynamic programming, Fine-grained, Structured pruning, TVM
Publication Year: 2023
Degree: Master's
Abstract: The convolutional neural network (CNN) is a deep learning technique that has revolutionized the field of computer vision. In modern CNN models, convolution typically accounts for the majority of the computation time. Model compression is a family of methods for reducing the size and computational cost of a neural network while preserving its accuracy; weight pruning, one such method, removes redundant or unimportant weights from the network. In this work, we propose a dynamic programming algorithm that, given a total time budget, finds a suitable sparsity ratio for each layer individually, based on the layers' execution times and L1 norms. After deciding the sparsity ratio for every layer, we modify TVM to generate code that uses a mask to indicate which data to load for processing. Furthermore, we propose the CHWN layout, which moves the batch dimension (N) to the innermost position; this eliminates the varying size of the innermost dimension and makes the memory access pattern contiguous. Experiments show that our scheme achieves a 0.35% accuracy improvement and a 1.55x speedup over the dense model on VGG-16 with the ImageNet dataset.
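The budgeted sparsity selection described above can be read as a multiple-choice knapsack problem: pick exactly one sparsity option per layer so that the summed execution time fits the budget while the summed L1-norm cost of the pruned weights is minimized. The sketch below is a minimal illustration of that reading, not the thesis's actual algorithm; the `choose_sparsity` helper, the time discretization, and all candidate numbers are assumptions.

```python
import math

def choose_sparsity(layers, budget, step=0.1):
    """Multiple-choice knapsack via DP over discretized time.

    layers: per-layer list of (sparsity, exec_time, l1_cost) candidates,
            where l1_cost models the importance lost by pruning (assumed).
    Returns (total_l1_cost, [chosen sparsity per layer]), or None if no
    assignment fits within `budget`.
    """
    slots = int(budget / step)
    INF = float("inf")
    # dp[t] = (min total cost, chosen ratios) using at most t time slots.
    dp = [(0.0, [])] * (slots + 1)
    for options in layers:
        new_dp = [(INF, None)] * (slots + 1)
        for t in range(slots + 1):
            for sparsity, exec_time, l1_cost in options:
                need = math.ceil(exec_time / step)
                if need <= t and dp[t - need][0] + l1_cost < new_dp[t][0]:
                    prev_cost, prev_choice = dp[t - need]
                    new_dp[t] = (prev_cost + l1_cost, prev_choice + [sparsity])
        dp = new_dp
    best = min(dp, key=lambda state: state[0])
    return None if best[0] == INF else best

# Two toy layers, three candidate ratios each (all numbers illustrative).
layers = [
    [(0.0, 4.0, 0.0), (0.5, 2.5, 1.2), (0.9, 1.0, 5.0)],
    [(0.0, 6.0, 0.0), (0.5, 3.0, 0.8), (0.9, 1.5, 4.1)],
]
print(choose_sparsity(layers, budget=6.0))  # -> (2.0, [0.5, 0.5])
```

Higher sparsity buys execution time but discards more weight mass; the DP trades the two off across all layers at once rather than greedily per layer.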
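The CHWN claim can be illustrated independently of TVM: once the batch dimension N is innermost, the N values belonging to any single (c, h, w) position are adjacent in memory, so a mask over (c, h, w) positions always loads fixed-length contiguous runs. A small NumPy sketch under that reading (shapes and indices are made up for illustration):

```python
import numpy as np

# Assumed toy shape: batch N=8, channels C=64, height and width 32.
x_nchw = np.random.rand(8, 64, 32, 32).astype(np.float32)

# NCHW -> CHWN: move the batch dimension innermost and re-pack the buffer.
x_chwn = np.ascontiguousarray(x_nchw.transpose(1, 2, 3, 0))

# For any fixed (c, h, w), all N batch values now sit contiguously,
# so a kept position in the sparsity mask maps to one contiguous run.
c, h, w = 3, 5, 7
run = x_chwn[c, h, w]                      # length-N slice
assert run.flags["C_CONTIGUOUS"]           # contiguous in memory
assert np.allclose(run, x_nchw[:, c, h, w])
```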
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/89006
DOI: 10.6342/NTU202303337
Fulltext Rights: Authorized (restricted to campus-only access)
Embargo Lift Date: 2028-08-08
Appears in Collections: Department of Computer Science and Information Engineering

Files in This Item:
File: ntu-111-2.pdf (Restricted Access)
Size: 353.2 kB
Format: Adobe PDF

