NTU Theses and Dissertations Repository
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/94298
Full metadata record (DC field: value [language])
dc.contributor.advisor: 吳安宇 [zh_TW]
dc.contributor.advisor: An-Yeu Wu [en]
dc.contributor.author: 王婕恩 [zh_TW]
dc.contributor.author: Chieh-En Wang [en]
dc.date.accessioned: 2024-08-15T16:41:07Z
dc.date.available: 2024-08-16
dc.date.copyright: 2024-08-15
dc.date.issued: 2024
dc.date.submitted: 2024-08-05
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/94298
dc.description.abstract: 近期擴散模型(Diffusion models)因卓越的成像能力,在影像生成及編輯等領域獲得廣泛的認可,並在許多方面超越了傳統方法。儘管表現出色,這些模型固有的龐大運算量和記憶體需求,往往限制了它們在資源受限環境以及可攜式裝置上的實際應用。為了因應這些挑戰,後訓練量化(Post-Training Quantization, PTQ)成為一種有效的解決方案:此方法可以在不需要重新訓練的情況下,減少所需的運算時間以及記憶體空間。然而,由於擴散模型具備多重時間步長(multiple-timestep)的特性,傳統的對稱後量化方法無法有效因應此特性,往往造成明顯的量化誤差。
為了解決這一挑戰,我們提出了一項針對時間步長分群的量化方法,專門設計用於處理擴散模型中多個時間步長的特性,使得不同時間群組之間可以找到更適當的量化因子,進而提升量化的準確度。此外,我們發現 SiLU 層運算後的激活數值呈現非均勻分布,此分布可能導致劇烈的量化損失。為了減輕這一問題,我們引入了一種區域特定的量化策略,更準確地表示量化後的極值,從而提升模型的整體性能。透過整合這些創新方法,我們成功地開發出一種可應用於整體擴散模型的量化方法。我們的實驗結果表明,該方法在進行 8 位元量化後,仍能有效地保持 Frechet Inception Distance (FID) 得分,突顯了此方法在實際應用中的潛力。[zh_TW]
dc.description.abstract: Diffusion models (DMs) have recently garnered widespread acclaim for their superior imaging capabilities, which have demonstrated significant advancements over traditional methods. Despite their impressive performance, the extensive computational and memory demands inherent in these models often restrict their practical application on portable and resource-constrained devices. To mitigate these challenges, post-training quantization (PTQ) offers a promising solution that compresses the model and reduces runtime without requiring retraining. However, traditional PTQ methods struggle to handle the time-variant activation distributions inherent in diffusion models.
To address this challenge, we propose a novel timestep-grouping PTQ approach specifically designed to manage the complexities associated with multiple timesteps in diffusion models. Additionally, we have identified that non-uniform post-SiLU (Sigmoid Linear Unit) activations can result in significant quantization losses. To mitigate this issue, we introduce a region-specific quantization strategy that more accurately represents extreme values after quantization, thereby enhancing the model's overall performance. By integrating these methods, we have developed a fully quantized diffusion model that is feasible for hardware implementation. Our experimental results demonstrate that the proposed approach effectively maintains the Frechet Inception Distance (FID) score even after 8-bit quantization, underscoring the potential of our method for practical applications. [en]
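The abstract above summarizes two techniques at a high level: assigning quantization scaling factors per group of timesteps, and using a region-specific scale so that the skewed post-SiLU activation distribution keeps its extreme values after quantization. As a rough, illustrative sketch of that flavor of approach (not the thesis's exact algorithms), the NumPy mock-up below groups timesteps uniformly, derives one symmetric 8-bit scale per group from calibration activations, and fake-quantizes a post-SiLU tensor with separate scales for its negative and positive regions; all function names, the uniform grouping rule, and the max-abs calibration statistic are assumptions made here for illustration (the thesis itself also studies k-means-based and similarity-based grouping, per the table of contents).

```python
# Illustrative sketch only: a minimal NumPy mock-up of (1) per-timestep-group
# quantization scales and (2) a two-region quantizer for post-SiLU activations.
# Names and the grouping rule are assumptions for illustration, not the
# thesis's exact algorithms.
import numpy as np


def silu(x):
    """SiLU activation x * sigmoid(x); its output is skewed: bounded near -0.28 below, unbounded above."""
    return x / (1.0 + np.exp(-x))


def group_timesteps_uniform(num_timesteps, num_groups):
    """Split timesteps 0..T-1 into contiguous, roughly equal-sized groups."""
    return np.array_split(np.arange(num_timesteps), num_groups)


def per_group_scales(calib_acts, groups, n_bits=8):
    """One symmetric scale per timestep group, from the max |activation| observed in that group.

    calib_acts: dict {timestep: 1-D array of calibration activations}.
    """
    qmax = 2 ** (n_bits - 1) - 1
    scales = []
    for group in groups:
        group_max = max(np.abs(calib_acts[t]).max() for t in group)
        scales.append(max(group_max / qmax, 1e-8))
    return scales


def quantize_two_region(x, n_bits=8):
    """Fake-quantize a post-SiLU tensor with separate scales for the narrow
    negative region and the wide positive region, so the extreme values of
    each region are represented more faithfully than with one shared scale."""
    qmax = 2 ** (n_bits - 1) - 1
    neg, pos = x[x < 0], x[x >= 0]
    s_neg = max(np.abs(neg).max() / qmax, 1e-8) if neg.size else 1.0
    s_pos = max(pos.max() / qmax, 1e-8) if pos.size else 1.0
    return np.where(
        x < 0,
        np.clip(np.round(x / s_neg), -qmax, qmax) * s_neg,
        np.clip(np.round(x / s_pos), -qmax, qmax) * s_pos,
    )


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T, G = 100, 5
    # Toy calibration activations whose spread grows with the timestep index.
    calib = {t: silu(rng.normal(0.0, 1.0 + 0.02 * t, 4096)) for t in range(T)}
    groups = group_timesteps_uniform(T, G)
    print("per-group 8-bit scales:", np.round(per_group_scales(calib, groups), 4))
    x = calib[T - 1]
    err = np.abs(quantize_two_region(x) - x).mean()
    print("mean |error| of two-region 8-bit fake quantization:", err)
```

The toy calibration data at the bottom produces activations whose spread grows with the timestep index, mimicking the time-variant distributions that motivate timestep grouping; in a real pipeline these statistics would instead come from activations collected while running the diffusion model on calibration samples.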
dc.description.provenance: Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2024-08-15T16:41:07Z. No. of bitstreams: 0 [en]
dc.description.provenance: Made available in DSpace on 2024-08-15T16:41:07Z (GMT). No. of bitstreams: 0 [en]
dc.description.tableofcontents: CONTENTS
誌謝 (Acknowledgements) v
摘要 (Chinese Abstract) vii
ABSTRACT ix
CONTENTS xi
LIST OF FIGURES xv
LIST OF TABLES xviii
Chapter 1 Introduction 1
1.1 Background 1
1.1.1 Diffusion Model 1
1.1.2 Model Compression 1
1.2 Diffusion Model 3
1.2.1 Fundamental of Diffusion Model 3
1.2.2 Application of Diffusion Model 5
1.2.3 Metric for Generating Images 6
1.3 Motivation and Main Contributions 7
1.4 Thesis Organization 9
Chapter 2 Review of Quantization Approaches 11
2.1 Model Quantization 11
2.1.1 Quantization-Aware Training 11
2.1.2 Post-Training Quantization 12
2.2 Region-Specific Quantization 13
2.2.1 Twin-Uniform Quantization 14
2.2.2 Value-Aware Two-Scaled Scaling Factor 16
2.3 Review of Post-Training Quantization for Diffusion Models 17
2.3.1 Normally Distributed Timestep Calibration (NDTC) 18
2.3.2 Shortcut-Splitting Quantization 19
2.3.3 Quantization Noise Disentangle 20
2.4 Challenges of the Prior Works 21
2.5 Summary 21
Chapter 3 Proposed Timestep-Variant Region-Specific Scaling Factor 23
3.1 Challenges of Non-uniform Distributions 23
3.2 Proposed Timestep-Variant Region-Specific Scaling Factor 25
3.3 Experimental Results 27
3.4 Hardware Mapping of TRSF 29
3.5 Summary 30
Chapter 4 Proposed Timestep-Grouping Quantization 33
4.1 Challenges of Multiple-Timestep Issues 33
4.2 Uniform Grouping 34
4.2.1 Algorithm of Uniform Grouping 34
4.2.2 Experimental Results 35
4.3 K-Means-Based Grouping 35
4.3.1 Algorithm of K-Means-Based Grouping 35
4.3.2 Experimental Results 36
4.3.3 Drawbacks of K-Means-Based Method 38
4.4 Summary 39
Chapter 5 Similarity-based Grouping Method 41
5.1 Proposed Similarity-based Grouping 41
5.1.1 Observations from Separately Quantizing Different Layers 41
5.1.2 Algorithm of Similarity-Based Grouping 44
5.1.3 Experimental Results 47
5.1.4 Hardware Mapping of Similarity-Based Grouping 54
5.2 Ablation Study 55
5.3 Differences in Calibration Steps Between General Models and Diffusion Models 56
5.4 Systematic Calibration Method 57
5.4.1 Proposed Magnitude-aware Calibration Method 58
5.4.2 Experimental Results 58
5.5 Summary 59
Chapter 6 Contributions and Future Works 61
6.1 Contributions 61
6.2 Future Works 62
Reference 65
dc.language.iso: en
dc.subject: 模型壓縮 [zh_TW]
dc.subject: 擴散模型 [zh_TW]
dc.subject: 後訓練量化 [zh_TW]
dc.subject: post-training quantization [en]
dc.subject: Diffusion model [en]
dc.subject: model compression [en]
dc.title: 基於時間步長分群技術之擴散模型後量化技術 [zh_TW]
dc.title: Post-Training Quantization for Diffusion Model Based on Timestep-Grouping Method [en]
dc.type: Thesis
dc.date.schoolyear: 112-2
dc.description.degree: 碩士 (Master)
dc.contributor.oralexamcommittee: 盧奕璋;沈中安;馬咏治 [zh_TW]
dc.contributor.oralexamcommittee: Yi-Chang Lu;Chung-An Shen;Win-Ken Be [en]
dc.subject.keyword: 擴散模型, 模型壓縮, 後訓練量化 [zh_TW]
dc.subject.keyword: Diffusion model, model compression, post-training quantization [en]
dc.relation.page: 68
dc.identifier.doi: 10.6342/NTU202403502
dc.rights.note: 未授權 (not authorized for public release)
dc.date.accepted: 2024-08-08
dc.contributor.author-college: 電機資訊學院 (College of Electrical Engineering and Computer Science)
dc.contributor.author-dept: 電子工程學研究所 (Graduate Institute of Electronics Engineering)
Appears in Collections: 電子工程學研究所 (Graduate Institute of Electronics Engineering)

Files in This Item:
ntu-112-2.pdf (4.59 MB, Adobe PDF), restricted access (not authorized for public release)


Items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.
