Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/99291
Full metadata record
dc.contributor.advisor: 陳尚澤 [zh_TW]
dc.contributor.advisor: Shang-Tse Chen [en]
dc.contributor.author: 李華健 [zh_TW]
dc.contributor.author: Wa Kin Lei [en]
dc.date.accessioned: 2025-08-21T17:09:00Z
dc.date.available: 2025-08-22
dc.date.copyright: 2025-08-21
dc.date.issued: 2025
dc.date.submitted: 2025-08-03
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/99291
dc.description.abstract: 隨著大型基石模型的興起,協同推理(Split Inference, SI)成為一種在輕量化邊緣設備與雲端伺服器間部署模型的熱門計算範式,有效解決了資料隱私與計算成本的問題。然而,目前大多數的資料重建攻擊主要集中在較小的卷積神經網路分類模型上,對於協同推理情境下基礎模型的隱私風險研究仍然有限。為了彌補此空白,我們提出了一種基於引導式擴散(Guided Diffusion)的新型資料重建攻擊方法,此方法利用了在大規模資料集上預訓練的潛空間擴散模型(Latent Diffusion Model, LDM)中嵌入的豐富先驗知識。我們的方法在潛空間擴散模型學習的影像先驗上執行迭代重建,能夠有效從中間表示(Intermediate Representations, IR)生成與原始資料高度相似的高保真影像。大量實驗表明,與現有最先進的方法相比,我們的方法在從視覺基礎模型深層中間表示重建資料的品質上具有顯著優勢。這些結果強調了在協同推理情境下,為大型模型提供更強隱私保護機制的緊迫性。 [zh_TW]
dc.description.abstract: With the rise of large foundation models, split inference (SI) has emerged as a popular computational paradigm for deploying models across lightweight edge devices and cloud servers, addressing data privacy and computational cost concerns. However, most existing data reconstruction attacks have focused on smaller CNN classification models, leaving the privacy risks of foundation models in SI settings largely unexplored. To address this gap, we propose a novel data reconstruction attack based on guided diffusion, which leverages the rich prior knowledge embedded in a latent diffusion model (LDM) pre-trained on a large-scale dataset. Our method performs iterative reconstruction on the LDM's learned image prior, effectively generating high-fidelity images resembling the original data from their intermediate representations (IRs). Extensive experiments demonstrate that our approach significantly outperforms state-of-the-art methods, both qualitatively and quantitatively, in reconstructing data from deep-layer IRs of the vision foundation model. The results highlight the urgent need for more robust privacy protection mechanisms for large models in SI scenarios. [en]
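The guided-diffusion reconstruction loop described in the abstract can be sketched roughly as follows. This is a hypothetical illustration based only on the abstract's description: the `denoiser`, `decoder`, and `client_model` names, the noise schedule handling, and the DDIM-style update with a guidance term are all assumptions, not the thesis's actual implementation.

```python
import torch

def reconstruct(ir_target, denoiser, decoder, client_model,
                alphas_cumprod, num_steps=50, guidance_scale=1.0):
    # Start from pure noise in the diffusion model's latent space
    # (shape chosen arbitrarily for illustration).
    z = torch.randn(1, 4, 64, 64)
    timesteps = torch.linspace(len(alphas_cumprod) - 1, 0, num_steps).long()
    for t in timesteps:
        z = z.detach().requires_grad_(True)
        eps = denoiser(z, t)                 # predicted noise at step t
        a_t = alphas_cumprod[t]
        # Tweedie estimate of the clean latent from the current noisy latent.
        z0_hat = (z - (1 - a_t).sqrt() * eps) / a_t.sqrt()
        # Guidance loss: the decoded estimate's intermediate representation
        # should match the IR observed by the attacker.
        loss = torch.nn.functional.mse_loss(client_model(decoder(z0_hat)),
                                            ir_target)
        grad = torch.autograd.grad(loss, z)[0]
        a_prev = alphas_cumprod[t - 1] if t > 0 else torch.tensor(1.0)
        # DDIM-style deterministic step, nudged against the guidance gradient.
        z = (a_prev.sqrt() * z0_hat + (1 - a_prev).sqrt() * eps
             - guidance_scale * grad).detach()
    return decoder(z)
```

In a real attack the denoiser and decoder would come from a pre-trained LDM and `client_model` would be the edge-side layers of the split foundation model; here they are placeholders to show the overall structure of guidance-based iterative reconstruction.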
dc.description.provenance: Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-08-21T17:09:00Z. No. of bitstreams: 0 [en]
dc.description.provenance: Made available in DSpace on 2025-08-21T17:09:00Z (GMT). No. of bitstreams: 0 [en]
dc.description.tableofcontents:
Verification Letter from the Oral Examination Committee  i
Acknowledgements  iii
摘要  v
Abstract  vii
Contents  ix
List of Figures  xiii
List of Tables  xv
Denotation  xvii
Chapter 1 Introduction  1
Chapter 2 Related Work  3
2.1 Split Inference  3
2.2 Data Reconstruction Attack (DRA)  4
2.3 Diffusion Models  5
Chapter 3 Methodology  7
3.1 Threat Model  7
3.2 Data Reconstruction Attack using Guided Diffusion  8
3.3 Extending DRAG with Inverse Networks  10
3.4 Adapting to Latent Diffusion Models  11
Chapter 4 Experimental Setups  13
4.1 Datasets  13
4.2 Target Model  14
4.3 Baseline and Metrics  14
4.4 Attacker Models  15
4.5 Distance Function and Regularization  15
Chapter 5 Experimental Results  17
5.1 Attacking Frozen Foundation Models  17
5.2 Enhancing DRAG with Inverse Networks  19
5.3 Distribution Shift  20
5.4 Attacking Privacy-Guarded Models  21
5.5 Token Shuffling (and Dropping) Defense  23
Chapter 6 Conclusion  27
References  29
Appendix A — Detailed Report  37
A.1 Attacking Frozen Foundation Models  37
A.2 Attacking Privacy-Guarded Models  37
A.3 Execution Time  37
Appendix B — Extended Experiments  43
B.1 Scaling Reconstruction Schedule  43
B.2 Guidance Strength w  43
B.3 Importance of the Optimizer  44
Appendix C — Baseline Attacks  45
Appendix D — Defensive Algorithms  47
D.1 Privacy Leakage Mitigation Methods  47
D.2 Implementation of Defense Mechanisms  48
Appendix E — Implementation Details  49
dc.language.iso: en
dc.subject: 隱私 [zh_TW]
dc.subject: 資料逆推攻擊 [zh_TW]
dc.subject: 擴散模型 [zh_TW]
dc.subject: 攻擊 [zh_TW]
dc.subject: attacks [en]
dc.subject: diffusion models [en]
dc.subject: privacy [en]
dc.subject: data reconstruction attacks [en]
dc.title: 利用擴散模型進行真實資料逆推攻擊 [zh_TW]
dc.title: Towards Real-World Data Reconstruction Attacks using Diffusion Prior [en]
dc.type: Thesis
dc.date.schoolyear: 113-2
dc.description.degree: 碩士
dc.contributor.oralexamcommittee: 陳駿丞;林守德;王紹睿 [zh_TW]
dc.contributor.oralexamcommittee: Jun-Cheng Chen;Shou-De Lin;Shao-Jui Wang [en]
dc.subject.keyword: 資料逆推攻擊,隱私,攻擊,擴散模型 [zh_TW]
dc.subject.keyword: data reconstruction attacks,privacy,attacks,diffusion models [en]
dc.relation.page: 50
dc.identifier.doi: 10.6342/NTU202502937
dc.rights.note: 未授權
dc.date.accepted: 2025-08-06
dc.contributor.author-college: 電機資訊學院
dc.contributor.author-dept: 資訊工程學系
dc.date.embargo-lift: N/A
Appears in Collections: 資訊工程學系

Files in This Item:
File: ntu-113-2.pdf (Restricted Access), 18.11 MB, Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
