Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/99291
Full metadata record
dc.contributor.advisor: 陳尚澤 [zh_TW]
dc.contributor.advisor: Shang-Tse Chen [en]
dc.contributor.author: 李華健 [zh_TW]
dc.contributor.author: Wa Kin Lei [en]
dc.date.accessioned: 2025-08-21T17:09:00Z
dc.date.available: 2025-08-22
dc.date.copyright: 2025-08-21
dc.date.issued: 2025
dc.date.submitted: 2025-08-03
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/99291
dc.description.abstract: 隨著大型基石模型的興起,協同推理(Split Inference, SI)成為一種在輕量化邊緣設備與雲端伺服器間部署模型的熱門計算範式,有效解決了資料隱私與計算成本的問題。然而,目前大多數的資料重建攻擊主要集中在較小的卷積神經網路分類模型上,對於協同推理情境下基礎模型的隱私風險研究仍然有限。為了彌補此空白,我們提出了一種基於引導式擴散(Guided Diffusion)的新型資料重建攻擊方法,此方法利用了在大規模資料集上預訓練的潛空間擴散模型(Latent Diffusion Model, LDM)中嵌入的豐富先驗知識。我們的方法在潛空間擴散模型學習的影像先驗上執行迭代重建,能夠有效從中間表示(Intermediate Representations, IR)生成與原始資料高度相似的高保真影像。大量實驗表明,與現有最先進的方法相比,我們的方法在從視覺基礎模型深層中間表示重建資料的品質上具有顯著優勢。這些結果強調了在協同推理情境下,為大型模型提供更強隱私保護機制的緊迫性。 [zh_TW]
dc.description.abstract: With the rise of large foundation models, split inference (SI) has emerged as a popular computational paradigm for deploying models across lightweight edge devices and cloud servers, addressing data privacy and computational cost concerns. However, most existing data reconstruction attacks have focused on smaller CNN classification models, leaving the privacy risks of foundation models in SI settings largely unexplored. To address this gap, we propose a novel data reconstruction attack based on guided diffusion, which leverages the rich prior knowledge embedded in a latent diffusion model (LDM) pre-trained on a large-scale dataset. Our method performs iterative reconstruction on the LDM's learned image prior, effectively generating high-fidelity images resembling the original data from their intermediate representations (IRs). Extensive experiments demonstrate that our approach significantly outperforms state-of-the-art methods, both qualitatively and quantitatively, in reconstructing data from deep-layer IRs of the vision foundation model. The results highlight the urgent need for more robust privacy protection mechanisms for large models in SI scenarios. [en]
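The guided-diffusion reconstruction loop described in the abstract can be sketched roughly as follows. This is a hypothetical illustration based only on the abstract's description: the `denoiser`, `decoder`, and `client_model` names, the noise schedule handling, and the DDIM-style update with a guidance term are all assumptions, not the thesis's actual implementation.

```python
import torch

def reconstruct(ir_target, denoiser, decoder, client_model,
                alphas_cumprod, num_steps=50, guidance_scale=1.0):
    # Start from pure noise in the diffusion model's latent space
    # (shape chosen arbitrarily for illustration).
    z = torch.randn(1, 4, 64, 64)
    timesteps = torch.linspace(len(alphas_cumprod) - 1, 0, num_steps).long()
    for t in timesteps:
        z = z.detach().requires_grad_(True)
        eps = denoiser(z, t)                 # predicted noise at step t
        a_t = alphas_cumprod[t]
        # Tweedie estimate of the clean latent from the current noisy latent.
        z0_hat = (z - (1 - a_t).sqrt() * eps) / a_t.sqrt()
        # Guidance loss: the decoded estimate's intermediate representation
        # should match the IR observed by the attacker.
        loss = torch.nn.functional.mse_loss(client_model(decoder(z0_hat)),
                                            ir_target)
        grad = torch.autograd.grad(loss, z)[0]
        a_prev = alphas_cumprod[t - 1] if t > 0 else torch.tensor(1.0)
        # DDIM-style deterministic step, nudged against the guidance gradient.
        z = (a_prev.sqrt() * z0_hat + (1 - a_prev).sqrt() * eps
             - guidance_scale * grad).detach()
    return decoder(z)
```

In a real attack the denoiser and decoder would come from a pre-trained LDM and `client_model` would be the edge-side layers of the split foundation model; here they are placeholders to show the overall structure of guidance-based iterative reconstruction.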
dc.description.provenance: Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-08-21T17:09:00Z. No. of bitstreams: 0 [en]
dc.description.provenance: Made available in DSpace on 2025-08-21T17:09:00Z (GMT). No. of bitstreams: 0 [en]
dc.description.tableofcontents:
Verification Letter from the Oral Examination Committee  i
Acknowledgements  iii
摘要  v
Abstract  vii
Contents  ix
List of Figures  xiii
List of Tables  xv
Denotation  xvii
Chapter 1 Introduction  1
Chapter 2 Related Work  3
2.1 Split Inference  3
2.2 Data Reconstruction Attack (DRA)  4
2.3 Diffusion Models  5
Chapter 3 Methodology  7
3.1 Threat Model  7
3.2 Data Reconstruction Attack using Guided Diffusion  8
3.3 Extending DRAG with Inverse Networks  10
3.4 Adapting to Latent Diffusion Models  11
Chapter 4 Experimental Setups  13
4.1 Datasets  13
4.2 Target Model  14
4.3 Baseline and Metrics  14
4.4 Attacker Models  15
4.5 Distance Function and Regularization  15
Chapter 5 Experimental Results  17
5.1 Attacking Frozen Foundation Models  17
5.2 Enhancing DRAG with Inverse Networks  19
5.3 Distribution Shift  20
5.4 Attacking Privacy-Guarded Models  21
5.5 Token Shuffling (and Dropping) Defense  23
Chapter 6 Conclusion  27
References  29
Appendix A — Detailed Report  37
A.1 Attacking Frozen Foundation Models  37
A.2 Attacking Privacy-Guarded Models  37
A.3 Execution Time  37
Appendix B — Extended Experiments  43
B.1 Scaling Reconstruction Schedule  43
B.2 Guidance Strength w  43
B.3 Importance of the Optimizer  44
Appendix C — Baseline Attacks  45
Appendix D — Defensive Algorithms  47
D.1 Privacy Leakage Mitigation Methods  47
D.2 Implementation of Defense Mechanisms  48
Appendix E — Implementation Details  49
dc.language.iso: en
dc.subject: 隱私 [zh_TW]
dc.subject: 資料逆推攻擊 [zh_TW]
dc.subject: 擴散模型 [zh_TW]
dc.subject: 攻擊 [zh_TW]
dc.subject: attacks [en]
dc.subject: diffusion models [en]
dc.subject: privacy [en]
dc.subject: data reconstruction attacks [en]
dc.title: 利用擴散模型進行真實資料逆推攻擊 [zh_TW]
dc.title: Towards Real-World Data Reconstruction Attacks using Diffusion Prior [en]
dc.type: Thesis
dc.date.schoolyear: 113-2
dc.description.degree: 碩士
dc.contributor.oralexamcommittee: 陳駿丞;林守德;王紹睿 [zh_TW]
dc.contributor.oralexamcommittee: Jun-Cheng Chen;Shou-De Lin;Shao-Jui Wang [en]
dc.subject.keyword: 資料逆推攻擊,隱私,攻擊,擴散模型 [zh_TW]
dc.subject.keyword: data reconstruction attacks,privacy,attacks,diffusion models [en]
dc.relation.page: 50
dc.identifier.doi: 10.6342/NTU202502937
dc.rights.note: 未授權
dc.date.accepted: 2025-08-06
dc.contributor.author-college: 電機資訊學院
dc.contributor.author-dept: 資訊工程學系
dc.date.embargo-lift: N/A
Appears in Collections: 資訊工程學系

Files in This Item:
File: ntu-113-2.pdf (Restricted Access), 18.11 MB, Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
