NTU Theses and Dissertations Repository
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/94298
Full metadata record (DC field: value [language])
dc.contributor.advisor: 吳安宇 [zh_TW]
dc.contributor.advisor: An-Yeu Wu [en]
dc.contributor.author: 王婕恩 [zh_TW]
dc.contributor.author: Chieh-En Wang [en]
dc.date.accessioned: 2024-08-15T16:41:07Z
dc.date.available: 2024-08-16
dc.date.copyright: 2024-08-15
dc.date.issued: 2024
dc.date.submitted: 2024-08-05
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/94298
dc.description.abstract: 近期擴散模型(Diffusion models)因卓越的成像能力,在影像生成及編輯等領域獲得廣泛的認可,並在許多方面超越了傳統方法。儘管表現出色,這些模型固有的龐大運算量和記憶體需求,往往限制了它們在資源受限環境以及可攜式裝置上的實際應用。為了因應這些挑戰,後訓練量化(Post-Training Quantization, PTQ)成為一種有效的解決方案:此方法可以在不需要重新訓練的情況下,減少所需的運算時間以及記憶體空間。然而,由於擴散模型具備多重時間步長(multiple-timestep)的特性,傳統的對稱後量化方法無法有效因應此特性,往往造成明顯的量化誤差。
為了解決這一挑戰,我們提出了一項針對時間步長分群的量化方法,專門設計用於處理擴散模型中多個時間步長的特性,使得不同時間群組之間可以找到更適當的量化因子,進而提升量化的準確度。此外,我們發現 SiLU 層運算後的激活數值呈現非均勻分布,此分布可能導致劇烈的量化損失。為了減輕這一問題,我們引入了一種區域特定的量化策略,更準確地表示量化後的極值,從而提升模型的整體性能。透過整合這些創新方法,我們成功地開發出一種可應用於整體擴散模型的量化方法。我們的實驗結果表明,該方法在進行 8 位元量化後,仍能有效地保持 Frechet Inception Distance (FID) 得分,突顯了此方法在實際應用中的潛力。[zh_TW]
dc.description.abstract: Diffusion models (DMs) have recently garnered widespread acclaim for their superior imaging capabilities, which have demonstrated significant advancements over traditional methods. Despite their impressive performance, the extensive computational and memory demands inherent in these models often restrict their practical application on portable and resource-constrained devices. To mitigate these challenges, post-training quantization (PTQ) offers a promising solution that compresses the model and reduces runtime without requiring retraining. However, traditional PTQ methods struggle to handle the time-variant activation distributions inherent in diffusion models.
To address this challenge, we propose a novel timestep-grouping PTQ approach specifically designed to manage the complexities associated with multiple timesteps in diffusion models. Additionally, we have identified that non-uniform post-SiLU (Sigmoid Linear Unit) activations can result in significant quantization losses. To mitigate this issue, we introduce a region-specific quantization strategy that more accurately represents extreme values after quantization, thereby enhancing the model's overall performance. By integrating these methods, we have developed a fully quantized diffusion model that is feasible for hardware implementation. Our experimental results demonstrate that the proposed approach effectively maintains the Frechet Inception Distance (FID) score even after 8-bit quantization, underscoring the potential of our method for practical applications. [en]
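The abstract above summarizes two techniques at a high level: assigning quantization scaling factors per group of timesteps, and using a region-specific scale so that the skewed post-SiLU activation distribution keeps its extreme values after quantization. As a rough, illustrative sketch of that flavor of approach (not the thesis's exact algorithms), the NumPy mock-up below groups timesteps uniformly, derives one symmetric 8-bit scale per group from calibration activations, and fake-quantizes a post-SiLU tensor with separate scales for its negative and positive regions; all function names, the uniform grouping rule, and the max-abs calibration statistic are assumptions made here for illustration (the thesis itself also studies k-means-based and similarity-based grouping, per the table of contents).

```python
# Illustrative sketch only: a minimal NumPy mock-up of (1) per-timestep-group
# quantization scales and (2) a two-region quantizer for post-SiLU activations.
# Names and the grouping rule are assumptions for illustration, not the
# thesis's exact algorithms.
import numpy as np


def silu(x):
    """SiLU activation x * sigmoid(x); its output is skewed: bounded near -0.28 below, unbounded above."""
    return x / (1.0 + np.exp(-x))


def group_timesteps_uniform(num_timesteps, num_groups):
    """Split timesteps 0..T-1 into contiguous, roughly equal-sized groups."""
    return np.array_split(np.arange(num_timesteps), num_groups)


def per_group_scales(calib_acts, groups, n_bits=8):
    """One symmetric scale per timestep group, from the max |activation| observed in that group.

    calib_acts: dict {timestep: 1-D array of calibration activations}.
    """
    qmax = 2 ** (n_bits - 1) - 1
    scales = []
    for group in groups:
        group_max = max(np.abs(calib_acts[t]).max() for t in group)
        scales.append(max(group_max / qmax, 1e-8))
    return scales


def quantize_two_region(x, n_bits=8):
    """Fake-quantize a post-SiLU tensor with separate scales for the narrow
    negative region and the wide positive region, so the extreme values of
    each region are represented more faithfully than with one shared scale."""
    qmax = 2 ** (n_bits - 1) - 1
    neg, pos = x[x < 0], x[x >= 0]
    s_neg = max(np.abs(neg).max() / qmax, 1e-8) if neg.size else 1.0
    s_pos = max(pos.max() / qmax, 1e-8) if pos.size else 1.0
    return np.where(
        x < 0,
        np.clip(np.round(x / s_neg), -qmax, qmax) * s_neg,
        np.clip(np.round(x / s_pos), -qmax, qmax) * s_pos,
    )


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T, G = 100, 5
    # Toy calibration activations whose spread grows with the timestep index.
    calib = {t: silu(rng.normal(0.0, 1.0 + 0.02 * t, 4096)) for t in range(T)}
    groups = group_timesteps_uniform(T, G)
    print("per-group 8-bit scales:", np.round(per_group_scales(calib, groups), 4))
    x = calib[T - 1]
    err = np.abs(quantize_two_region(x) - x).mean()
    print("mean |error| of two-region 8-bit fake quantization:", err)
```

The toy calibration data at the bottom produces activations whose spread grows with the timestep index, mimicking the time-variant distributions that motivate timestep grouping; in a real pipeline these statistics would instead come from activations collected while running the diffusion model on calibration samples.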
dc.description.provenance: Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2024-08-15T16:41:07Z. No. of bitstreams: 0 [en]
dc.description.provenance: Made available in DSpace on 2024-08-15T16:41:07Z (GMT). No. of bitstreams: 0 [en]
dc.description.tableofcontents: CONTENTS
誌謝 (Acknowledgements) v
摘要 (Chinese Abstract) vii
ABSTRACT ix
CONTENTS xi
LIST OF FIGURES xv
LIST OF TABLES xviii
Chapter 1 Introduction 1
1.1 Background 1
1.1.1 Diffusion Model 1
1.1.2 Model Compression 1
1.2 Diffusion Model 3
1.2.1 Fundamental of Diffusion Model 3
1.2.2 Application of Diffusion Model 5
1.2.3 Metric for Generating Images 6
1.3 Motivation and Main Contributions 7
1.4 Thesis Organization 9
Chapter 2 Review of Quantization Approaches 11
2.1 Model Quantization 11
2.1.1 Quantization-Aware Training 11
2.1.2 Post-Training Quantization 12
2.2 Region-Specific Quantization 13
2.2.1 Twin-Uniform Quantization 14
2.2.2 Value-Aware Two-Scaled Scaling Factor 16
2.3 Review of Post-Training Quantization for Diffusion Models 17
2.3.1 Normally Distributed Timestep Calibration (NDTC) 18
2.3.2 Shortcut-Splitting Quantization 19
2.3.3 Quantization Noise Disentangle 20
2.4 Challenges of the Prior Works 21
2.5 Summary 21
Chapter 3 Proposed Timestep-Variant Region-Specific Scaling Factor 23
3.1 Challenges of Non-uniform Distributions 23
3.2 Proposed Timestep-Variant Region-Specific Scaling Factor 25
3.3 Experimental Results 27
3.4 Hardware Mapping of TRSF 29
3.5 Summary 30
Chapter 4 Proposed Timestep-Grouping Quantization 33
4.1 Challenges of Multiple-Timestep Issues 33
4.2 Uniform Grouping 34
4.2.1 Algorithm of Uniform Grouping 34
4.2.2 Experimental Results 35
4.3 K-Means-Based Grouping 35
4.3.1 Algorithm of K-Means-Based Grouping 35
4.3.2 Experimental Results 36
4.3.3 Drawbacks of K-Means-Based Method 38
4.4 Summary 39
Chapter 5 Similarity-based Grouping Method 41
5.1 Proposed Similarity-based Grouping 41
5.1.1 Observations from Separately Quantizing Different Layers 41
5.1.2 Algorithm of Similarity-Based Grouping 44
5.1.3 Experimental Results 47
5.1.4 Hardware Mapping of Similarity-Based Grouping 54
5.2 Ablation Study 55
5.3 Differences in Calibration Steps Between General Models and Diffusion Models 56
5.4 Systematic Calibration Method 57
5.4.1 Proposed Magnitude-aware Calibration Method 58
5.4.2 Experimental Results 58
5.5 Summary 59
Chapter 6 Contributions and Future Works 61
6.1 Contributions 61
6.2 Future Works 62
Reference 65
dc.language.iso: en
dc.subject: 模型壓縮 [zh_TW]
dc.subject: 擴散模型 [zh_TW]
dc.subject: 後訓練量化 [zh_TW]
dc.subject: post-training quantization [en]
dc.subject: Diffusion model [en]
dc.subject: model compression [en]
dc.title: 基於時間步長分群技術之擴散模型後量化技術 [zh_TW]
dc.title: Post-Training Quantization for Diffusion Model Based on Timestep-Grouping Method [en]
dc.type: Thesis
dc.date.schoolyear: 112-2
dc.description.degree: 碩士 (Master)
dc.contributor.oralexamcommittee: 盧奕璋;沈中安;馬咏治 [zh_TW]
dc.contributor.oralexamcommittee: Yi-Chang Lu;Chung-An Shen;Win-Ken Be [en]
dc.subject.keyword: 擴散模型, 模型壓縮, 後訓練量化 [zh_TW]
dc.subject.keyword: Diffusion model, model compression, post-training quantization [en]
dc.relation.page: 68
dc.identifier.doi: 10.6342/NTU202403502
dc.rights.note: 未授權 (not authorized for public release)
dc.date.accepted: 2024-08-08
dc.contributor.author-college: 電機資訊學院 (College of Electrical Engineering and Computer Science)
dc.contributor.author-dept: 電子工程學研究所 (Graduate Institute of Electronics Engineering)
Appears in Collections: 電子工程學研究所 (Graduate Institute of Electronics Engineering)

Files in This Item:
ntu-112-2.pdf (4.59 MB, Adobe PDF), restricted access (not authorized for public release)


Items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.
