NTU Theses and Dissertations Repository › College of Electrical Engineering and Computer Science › Graduate Institute of Communication Engineering
Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/101472
Title: DualRobust: Dual-Robust Watermarking Against Instruction-Driven Image Editing
Authors: Min-Shiun Yang (楊旻勳)
Advisor: Jian-Jiun Ding (丁建均)
Keyword: Digital Watermarking, Deep Learning, Adversarial Training, Image Editing, Copyright Protection, Dual Robustness, Two-Stage Training, Correlation-Guided Optimization, Instruction-Driven Image Editing, Balanced Robustness
Publication Year: 2026
Degree: Master's
Abstract: With the rapid development of generative artificial intelligence, deep learning models have made breakthrough progress in image generation and editing. However, the spread of these technologies also raises new challenges, particularly for copyright protection and authenticity verification of digital content. Traditional watermarking techniques often cannot effectively resist deep-learning-based image editing attacks, so more robust watermark embedding and detection methods are urgently needed.
This thesis proposes a deep-learning-based adversarial watermarking technique that embeds an invisible watermark in an image and can still detect and extract it after various attacks. The method makes substantial improvements on prior work while retaining the core idea of the PIDSG (Perceptual Image Distortion and Semantic Guidance) module, which effectively balances watermark image quality and robustness, particularly against instruction-driven image editing.
The core innovations of this method are as follows:
First, we adopt a two-stage adversarial training strategy. The first stage focuses on the visual quality of the watermark embedding, ensuring imperceptibility; the second stage introduces diverse simulated noise attacks to improve robustness against a variety of attacks. This staged approach effectively balances the watermark's imperceptibility and its resistance to attacks. The entire pipeline is fully differentiable and supports end-to-end gradient-based optimization, allowing watermark embedding and detection to be trained jointly.
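The two-stage schedule above can be sketched as a simple epoch-indexed switch. The stage boundary and loss weights below are hypothetical (the abstract does not state them); in the real pipeline the weights would scale differentiable image-quality and message-recovery losses.

```python
# Minimal sketch of the two-stage training schedule described above.
# Stage boundary (30 epochs) and weights (0.7/0.3) are assumptions,
# not values from the thesis.

def stage_weights(epoch: int, stage1_epochs: int = 30):
    """Return (image_quality_weight, robustness_weight, use_noise_layer).

    Stage 1: optimize imperceptibility only, with no simulated attacks.
    Stage 2: enable the noise layer and add a robustness (message) loss.
    """
    if epoch < stage1_epochs:
        return 1.0, 0.0, False   # stage 1: visual quality only
    return 0.7, 0.3, True        # stage 2: attack simulation enabled

print(stage_weights(0))    # (1.0, 0.0, False)
print(stage_weights(30))   # (0.7, 0.3, True)
```

Decoupling the stages this way mirrors the claim in the text: the embedder first converges to a stable, imperceptible pattern before the harder robustness objective is introduced.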
Second, we use the Optuna_adversarial_search optimization procedure, in which Bayesian optimization (Optuna) automatically searches for the best hyperparameter combination. It adopts a multi-objective strategy that jointly considers robustness, visual quality, and detection accuracy, and integrates simulated attacks such as JPEG compression, Gaussian noise, rotation, cropping, and blurring, effectively balancing the model's robustness against various spatial-domain attacks.
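A hyperparameter search of this kind can be illustrated with a stdlib-only random-search stand-in. Real Optuna would use `optuna.create_study` with a TPE sampler and `trial.suggest_float`; the toy `evaluate` scoring below (and the `strength`/`noise_p` parameters) are invented for illustration, not the thesis's actual objective.

```python
import random

def evaluate(params):
    # Toy stand-in for a short training run. It rewards a moderate
    # embedding strength and noise intensity, combining the three
    # objectives named in the text as a weighted sum (illustrative only).
    strength, noise = params["strength"], params["noise_p"]
    robustness = 1.0 - abs(strength - 0.6)   # proxy for 1 - BER
    quality = 1.0 - strength * 0.5           # proxy for visual quality
    accuracy = 1.0 - abs(noise - 0.5) * 0.4  # proxy for detection accuracy
    return 0.4 * robustness + 0.3 * quality + 0.3 * accuracy

def random_search(n_trials=50, seed=0):
    # Random search as a minimal substitute for Bayesian optimization.
    rng = random.Random(seed)
    best, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {"strength": rng.uniform(0.1, 1.0),
                  "noise_p": rng.uniform(0.0, 1.0)}
        score = evaluate(params)
        if score > best_score:
            best, best_score = params, score
    return best, best_score

best, score = random_search()
print(best, score)
```

Swapping `random_search` for an Optuna study would keep the same `evaluate` interface; the multi-objective weighting is what lets one scalar score trade off robustness, quality, and accuracy.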
Experimental results show that the method resists multiple attacks while maintaining high visual quality, keeping the average bit error rate below 15%. In particular, it exhibits excellent robustness against instruction-driven image editing attacks, providing an effective technical solution for copyright protection of digital content.
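The bit error rate (BER) used as the metric above is simply the fraction of watermark bits recovered incorrectly after an attack:

```python
# BER: fraction of watermark bits flipped between the embedded and
# the recovered message. The 8-bit example message is illustrative.

def bit_error_rate(original, recovered):
    assert len(original) == len(recovered)
    errors = sum(o != r for o, r in zip(original, recovered))
    return errors / len(original)

msg  = [1, 0, 1, 1, 0, 0, 1, 0]
recv = [1, 0, 0, 1, 0, 0, 1, 1]   # two bits flipped by an attack
print(bit_error_rate(msg, recv))  # 0.25
```

Under this metric, "average BER below 15%" means that, averaged over attacks, fewer than roughly 1 in 7 watermark bits are lost.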
The main contributions of this research are: (1) an innovative two-stage adversarial training strategy that effectively balances watermark imperceptibility and robustness; (2) a comprehensive attack-resistance mechanism covering both modern AI image editing techniques and traditional spatial-domain attacks; (3) a hyperparameter optimization system that improves the method's practicality and scalability; (4) retention of the PIDSG module's core advantages, with substantial technical improvements built on top of it.

This research provides an important technical advance for the field of digital content protection, with broad application prospects in digital rights management, content authentication, and provenance tracking. Future work will extend the approach to video watermarking, 3D content protection, and other wider application scenarios.
With the rapid development of generative artificial intelligence, instruction-driven image editing based on diffusion models has revolutionized content creation. While offering unprecedented flexibility, this advancement complicates copyright protection, making digital watermarking an essential tool for content authentication. In the past, watermarking schemes were primarily designed to resist traditional geometric distortions such as rotation and scaling. Recently, learning-based methods have emerged to tackle semantic-level transformations introduced by AI editing. However, existing approaches typically excel in only one domain, failing to maintain robustness against both semantic edits and geometric attacks simultaneously. We propose a novel watermarking framework that bridges this gap, achieving dual robustness against both attack types while outperforming current standards.
First, we introduce a two-stage training strategy to address the inherent trade-off between image quality and watermark robustness. Unlike conventional single-stage methods, we decouple the learning process: the first stage focuses exclusively on high-fidelity watermark embedding to establish stable patterns, while the second stage incorporates a configurable noise layer to simulate diverse attacks. This separation allows the model to learn resilient features without compromising visual quality, effectively solving the optimization conflicts found in previous works.
Secondly, the overall model architecture integrates a U-Net-based encoder and a ResNet-based decoder, optimized through a unique correlation-guided optimization framework. To manage the computational complexity of training against numerous diverse attacks, we propose an attack correlation analysis matrix. This matrix quantifies the relationships between a small subset of training attacks and a broader range of evaluation attacks. By leveraging these correlations within a Bayesian optimization loop, we enable efficient knowledge transfer, allowing the model to generalize well to unseen distortions without exhaustive training on every specific attack type.
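The attack correlation analysis above can be sketched as a Pearson-correlation matrix over per-attack BER profiles. The profiles below (BER at five hypothetical checkpoints) are invented for illustration; in the thesis the matrix relates a small training-attack subset to a broader evaluation set.

```python
import math

def pearson(x, y):
    # Pearson correlation coefficient between two equal-length profiles.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical BER profiles over five training checkpoints.
profiles = {
    "jpeg":     [0.20, 0.15, 0.11, 0.08, 0.06],
    "gaussian": [0.22, 0.16, 0.12, 0.09, 0.07],
    "rotation": [0.25, 0.27, 0.26, 0.28, 0.27],
}
corr = {(a, b): pearson(p, q)
        for a, p in profiles.items()
        for b, q in profiles.items()}
print(round(corr[("jpeg", "gaussian")], 3))  # 0.999
```

A high correlation (JPEG vs. Gaussian noise here) suggests that robustness learned against one attack transfers to the other, so only one of them needs to appear in the training subset; a low or negative correlation (JPEG vs. rotation) flags an attack family that must be covered separately.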
Finally, experimental evaluations on the MagicBrush dataset demonstrate the effectiveness of our approach. Our model maintains a Bit Error Rate (BER) below 0.07 across all test scenarios, effectively handling both instruction-driven editing and traditional geometric perturbations. In comparison to existing state-of-the-art methods, our approach achieves a significantly lower BER standard deviation, providing a more balanced and reliable robustness profile for practical watermarking applications.
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/101472
DOI: 10.6342/NTU202600177
Fulltext Rights: Not authorized
Embargo lift date: N/A
Appears in Collections: Graduate Institute of Communication Engineering

Files in This Item:
File: ntu-114-1.pdf (Restricted Access), 1.79 MB, Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
