NTU Theses and Dissertations Repository
Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/94304
Title: Evasion Attack on Diffusion Model
Authors: Chun-Yen Shih (施君諺)
Advisor: Cheng-Fu Chou (周承復)
Keywords: Evasion Attack, Generative AI, Diffusion Model, Adversarial Attack, AI Security
Publication Year: 2024
Degree: Master's
Abstract: Owing to the superior quality of images generated by diffusion models (DMs), image editing poses a risk of infringing the copyrights of artists and image owners. A proactive and practical way to mitigate this risk is to restrict the model's access to its intended output image. Recent work has succeeded in generating adversarial images against Latent Diffusion Models (LDMs), but these methods do not transfer effectively to Pixel-based Diffusion Models (PDMs), leaving PDMs exploitable. In this work, we introduce an optimization-based pipeline capable of attacking both LDMs and PDMs. Our approach includes a novel feature loss that maximizes the distance between latent representations inside the ResNet blocks, thereby strengthening the attack. Additionally, we leverage a Variational Autoencoder (VAE) to map images into latent space, optimize the latent representation there, and then decode the adversarial latent. To preserve the quality of the adversarial image, we apply a segmentation mask that blends the decoded latent with the source image, producing an adversarial image that closely resembles the source while still disrupting diffusion-based editing. Finally, we conducted extensive experiments demonstrating that our method generates inconspicuous adversarial examples that effectively disrupt diffusion-based editing.
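The abstract describes three mechanics: a feature loss that maximizes the distance between latent representations, a budget-constrained optimization in latent space, and segmentation-mask blending of the result with the source image. The sketch below is a toy NumPy illustration of those mechanics only, not the thesis's implementation: `feature_distance`, `optimize_latent`, and `blend_with_mask` are hypothetical stand-ins, and a linear map `W` plays the role of the ResNet-block features that the real method would compute on VAE latents.

```python
import numpy as np

def feature_distance(z, z_src, W):
    # Surrogate for the feature loss: squared L2 distance between linear
    # features W @ z, standing in for latents inside a ResNet block.
    return float(np.sum((W @ z - W @ z_src) ** 2))

def optimize_latent(z_src, W, steps=50, lr=0.05, eps=0.5):
    # PGD-style gradient ascent on the feature distance, with an L-inf
    # budget `eps` around the source latent. A small constant offset makes
    # the initial gradient non-zero.
    z = z_src + 0.01
    for _ in range(steps):
        grad = 2.0 * W.T @ (W @ z - W @ z_src)    # analytic gradient of the loss
        z = z + lr * np.sign(grad)                # signed ascent step
        z = np.clip(z, z_src - eps, z_src + eps)  # project back into the budget
    return z

def blend_with_mask(adv, src, mask):
    # Segmentation-mask blending: keep adversarial content where mask == 1,
    # keep the source pixels everywhere else.
    return mask * adv + (1.0 - mask) * src

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 16))   # toy "feature extractor"
z_src = rng.normal(size=16)    # toy source latent
z_adv = optimize_latent(z_src, W)
```

The projection step is what keeps the adversarial example visually close to the source; the mask then confines any remaining visible change to the regions where the attack is needed.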
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/94304
DOI: 10.6342/NTU202402243
Fulltext Rights: Authorized (open access worldwide)
Appears in Collections: Department of Computer Science and Information Engineering

Files in This Item:
File          | Size    | Format
ntu-112-2.pdf | 20.3 MB | Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
