Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/97155

Full metadata record (DC field: value, language tag)
dc.contributor.advisor: 陳銘憲 (zh_TW)
dc.contributor.advisor: Ming-Syan Chen (en)
dc.contributor.author: 蔡莉亞 (zh_TW)
dc.contributor.author: Li-Ya Tsai (en)
dc.date.accessioned: 2025-02-27T16:26:46Z
dc.date.available: 2025-02-28
dc.date.copyright: 2025-02-27
dc.date.issued: 2025
dc.date.submitted: 2025-02-06
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/97155
dc.description.abstract (zh_TW): Vision Transformers (ViTs) deliver excellent performance in computer vision through self-attention, but their heavy computational cost makes practical deployment challenging. While Mixed-Precision Quantization (MPQ) can reduce model capacity and Early Exiting (EE) can improve inference efficiency, integrating the two raises critical difficulties: on one hand, quantization noise destabilizes early-exit decisions; on the other, dynamic layer usage complicates bit allocation.

To address these problems, we propose MAQEE (Mutual Adaptive Quantization with Early Exiting), a unified framework that establishes a mutually beneficial relationship between quantization and early exiting. Its key features are:

• Early Exiting-Aware MPQ: dynamically adjusts and reallocates quantization bit-widths according to each layer's actual utilization.
• Post-Quantization Self-Distillation: applies self-knowledge distillation after quantization to keep early-exit decisions stable.
• SQNR-incorporated Quantization-Aware Risk Control: folds the signal-to-quantization-noise ratio (SQNR) into the quantization process to strengthen the model's risk control.

Experiments on CIFAR-100 and ImageNet-1K confirm that MAQEE preserves the acceleration efficiency of MPQ and EE while improving classification accuracy by up to 6% over a plain MPQ baseline, demonstrating its effectiveness and potential in practical use.
dc.description.abstract (en): Vision Transformers (ViTs) excel in computer vision through self-attention mechanisms but face deployment challenges due to high computational demands. While Mixed-Precision Quantization (MPQ) reduces model capacity and Early Exiting (EE) improves inference efficiency, their integration introduces critical challenges: quantization noise destabilizes exit decisions, while dynamic layer usage complicates bit allocation. We propose Mutual Adaptive Quantization with Early Exiting (MAQEE), a unified framework enabling mutual synergy between quantization and early exiting. Our approach features Early Exiting-Aware MPQ with layer utilization-based bit reallocation, Post-Quantization Self-Distillation for early exiting stability, and SQNR-incorporated Quantization-Aware Risk Control. Experiments on CIFAR-100 and ImageNet-1K demonstrate MAQEE's effectiveness in achieving up to 6% higher classification accuracy compared with the MPQ baseline while preserving the acceleration efficiency of both MPQ and EE.
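The abstract names two quantities that interact in MAQEE: the signal-to-quantization-noise ratio (SQNR) used for risk control, and the confidence signal that drives early-exit decisions. As a minimal sketch only, assuming a symmetric uniform quantizer and a simple top-1 confidence threshold (the function names, the 0.9 threshold, and the quantizer itself are illustrative assumptions, not the thesis's implementation), the following NumPy snippet shows how both quantities can be computed:

```python
import numpy as np

def uniform_quantize(x: np.ndarray, bits: int) -> np.ndarray:
    # Symmetric uniform quantizer (illustrative): snap x onto a grid
    # with 2**(bits-1) - 1 positive levels spanning [-max|x|, max|x|].
    scale = np.max(np.abs(x)) / (2 ** (bits - 1) - 1)
    return np.round(x / scale) * scale

def sqnr_db(x: np.ndarray, x_q: np.ndarray) -> float:
    # Signal-to-quantization-noise ratio in dB: signal power
    # divided by the power of the quantization error.
    noise = x - x_q
    return 10.0 * np.log10(np.sum(x ** 2) / np.sum(noise ** 2))

def should_exit(logits: np.ndarray, threshold: float = 0.9) -> bool:
    # Toy early-exit rule: leave at this exit head if the top-1
    # softmax confidence clears a fixed threshold.
    p = np.exp(logits - logits.max())
    p /= p.sum()
    return float(p.max()) >= threshold

rng = np.random.default_rng(0)
x = rng.standard_normal(10_000)
for bits in (8, 6, 4):
    print(f"{bits}-bit SQNR: {sqnr_db(x, uniform_quantize(x, bits)):.1f} dB")
```

A useful rule of thumb: each extra bit of uniform quantization buys roughly 6 dB of SQNR, which is why reallocating bit-widths across layers, as the abstract describes, directly changes the amount of noise each exit head sees when making its decision.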
dc.description.provenance (en): Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-02-27T16:26:46Z. No. of bitstreams: 0
dc.description.provenance (en): Made available in DSpace on 2025-02-27T16:26:46Z (GMT). No. of bitstreams: 0
dc.description.tableofcontents:
Acknowledgements i
摘要 (Chinese Abstract) ii
Abstract iii
Contents iv
List of Figures vi
List of Tables vii
1 Introduction 1
2 Related work 5
2.1 Early Exiting for Vision Transformers 5
2.2 Quantization 6
3 Preliminaries 9
3.1 Model Quantization 9
3.2 Early Exiting 10
4 Mutual Adaptive Quantization with Early Exiting Framework 12
4.1 Early Exiting-Aware Mixed-Precision Quantization 13
4.1.1 Quantization Sensitivity Assessment 14
4.1.2 Utilization Frequency Under Early-Exiting 14
4.1.2.1 Optimization for Bit-Width Allocation 15
4.2 Self-Distillation Recovery 16
4.3 Quantization-Aware Risk Control 18
5 Experiment 20
5.1 Implementation 20
5.2 Baselines 21
5.3 Metrics 22
6 Results and Analysis 23
6.1 Results on CIFAR-100 and ImageNet-1K 24
6.2 Accuracy vs. Efficiency Trade-off 24
6.3 Ablation Study 26
7 Conclusion 27
References 28
dc.language.iso: en
dc.subject: 混合精度量化 (zh_TW)
dc.subject: 訓練後量化 (zh_TW)
dc.subject: 視覺變換器 (zh_TW)
dc.subject: 早期退出 (zh_TW)
dc.subject: Early Exiting (en)
dc.subject: Vision Transformers (ViTs) (en)
dc.subject: Post-Training Quantization (PTQ) (en)
dc.subject: Mixed-Precision Quantization (en)
dc.title: MAQEE:互適應量化與提早退出 (zh_TW)
dc.title: MAQEE: Mutual Adaptive Quantization with Early Exiting (en)
dc.type: Thesis
dc.date.schoolyear: 113-1
dc.description.degree: Master's (碩士)
dc.contributor.oralexamcommittee: 楊得年;吳齊人;帥宏翰 (zh_TW)
dc.contributor.oralexamcommittee: De-Nian Yang;Chi-Jen Wu;Hong-Han Shuai (en)
dc.subject.keyword: 視覺變換器, 訓練後量化, 混合精度量化, 早期退出 (zh_TW)
dc.subject.keyword: Vision Transformers (ViTs), Post-Training Quantization (PTQ), Mixed-Precision Quantization, Early Exiting (en)
dc.relation.page: 32
dc.identifier.doi: 10.6342/NTU202404797
dc.rights.note: Authorized for release (on-campus access only)
dc.date.accepted: 2025-02-07
dc.contributor.author-college: College of Electrical Engineering and Computer Science (電機資訊學院)
dc.contributor.author-dept: Graduate Institute of Communication Engineering (電信工程學研究所)
dc.date.embargo-lift: 2025-02-28
Appears in Collections: Graduate Institute of Communication Engineering

Files in This Item:
File: ntu-113-1.pdf | Size: 884.3 kB | Format: Adobe PDF
Access: restricted to NTU campus IP addresses (use the library's VPN service from off campus)

