Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/94638

Full metadata record (DC field: value [language]):
dc.contributor.advisor: 吳家麟 [zh_TW]
dc.contributor.advisor: Ja-Lin Wu [en]
dc.contributor.author: 鄭廷瑋 [zh_TW]
dc.contributor.author: Ting-Wei Cheng [en]
dc.date.accessioned: 2024-08-16T17:14:31Z
dc.date.available: 2024-08-17
dc.date.copyright: 2024-08-16
dc.date.issued: 2024
dc.date.submitted: 2024-08-13
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/94638
dc.description.abstract [zh_TW]: 本論文研究了 CLIP 文本編碼器在條件擴散模型中的適應,以解決語義編輯和去偏見相關的挑戰。我們探討了透過適應 (Adaptation) 在增強生成圖像語義屬性控制方面的有效性,同時減少內在偏見。研究利用了各種解耦策略,並對文字編碼器進行修改,以評估緩解與性別和種族相關偏見的潛力。我們的研究結果表明,通過針對性適應微調文本編碼器可以顯著提高語義控制的精確性和去偏見的有效性。本研究為圖像合成領域的更公平和可控的生成模型的發展做出了貢獻。
dc.description.abstract [en]: This thesis investigates the adaptation of the CLIP text encoder for use in conditional diffusion models to address challenges related to semantic editing and debiasing. We explore the effectiveness of low-rank adaptations in enhancing control over the semantic attributes of generated images while simultaneously reducing inherent biases. The study utilizes various disentanglement strategies and introduces modifications to the text encoder to evaluate the potential for mitigating biases related to gender and ethnicity. Our findings indicate that fine-tuning the text encoder with targeted adaptations can significantly improve the precision of semantic control and the effectiveness of debiasing. This work contributes to the development of fairer, more controllable generative models in the field of image synthesis.
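The abstract names low-rank adaptation (LoRA) as the technique used to fine-tune the CLIP text encoder. A minimal, self-contained sketch of the idea follows; all shapes, variable names, and values here are illustrative assumptions, not taken from the thesis:

```python
import numpy as np

rng = np.random.default_rng(0)

d, r, alpha = 8, 2, 4          # hidden size, LoRA rank, scaling factor (illustrative)
W = rng.normal(size=(d, d))    # frozen pretrained weight (e.g. a text-encoder projection)
A = rng.normal(size=(r, d))    # trainable down-projection, initialized randomly
B = np.zeros((d, r))           # trainable up-projection, initialized to zero

def lora_forward(x):
    # Base output plus a low-rank update; with B = 0 the adapter is a no-op,
    # so fine-tuning starts exactly at the pretrained model's behavior.
    return x @ W.T + (alpha / r) * (x @ A.T @ B.T)

x = rng.normal(size=(3, d))
assert np.allclose(lora_forward(x), x @ W.T)   # identity at initialization

# Only A and B are trained: 2*d*r parameters versus d*d for full fine-tuning.
print(2 * d * r, d * d)  # 32 64
```

The parameter-efficiency argument (Section 5.1.3 of the table of contents) rests on this count: the trainable adapter grows linearly in d while the frozen weight grows quadratically.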
dc.description.provenance [en]: Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2024-08-16T17:14:31Z. No. of bitstreams: 0.
dc.description.provenance [en]: Made available in DSpace on 2024-08-16T17:14:31Z (GMT). No. of bitstreams: 0.
dc.description.tableofcontents:
Acknowledgements  I
摘要  III
Abstract  V
Contents  VII
List of Figures  IX
List of Tables  XI
Denotation  XIII
Chapter 1  Introduction  1
  1.1  Research Objective and Questions  1
  1.2  Methodology Overview  2
Chapter 2  Related Works  3
  2.1  Disentanglement in Image Generative Models  3
  2.2  Bias in Diffusion Models  4
  2.3  Guidance Based Methods  4
  2.4  Semantic Control  5
Chapter 3  Background  7
  3.1  Diffusion Models  7
  3.2  Low-Rank Adaptation  8
Chapter 4  Method  9
  4.1  Concepts Editing  10
  4.2  Debiasing  11
    4.2.1  Gender Fairness  12
    4.2.2  Ethnic Fairness  13
Chapter 5  Experiments  15
  5.1  Concept Adjustment  15
    5.1.1  Linearly Adjustable  15
    5.1.2  Multiple Concepts Adjustable  16
    5.1.3  Parameter Efficiency  16
  5.2  Debiasing  17
    5.2.1  Gender Equality  18
    5.2.2  Ethnic Diversity  19
    5.2.3  Training Cost  19
Chapter 6  Conclusion and Discussion  21
  6.1  Conclusion  21
  6.2  Discussion  21
  6.3  Future Work  22
References  25
dc.language.iso: en
dc.subject: 語意編輯 [zh_TW]
dc.subject: 去偏見 [zh_TW]
dc.subject: 圖片生成 [zh_TW]
dc.subject: LoRA [zh_TW]
dc.subject: Image Generative [en]
dc.subject: Semantic Editing [en]
dc.subject: LoRA [en]
dc.subject: Debiasing [en]
dc.title: 透過 CLIP 文字編碼器適應進行條件擴散模型中的語義編輯和去偏見 [zh_TW]
dc.title: Semantic Editing and Debiasing in Conditional Diffusion Models by CLIP Text-Encoder Adaptation [en]
dc.type: Thesis
dc.date.schoolyear: 112-2
dc.description.degree: Master's
dc.contributor.oralexamcommittee: 許永真;胡敏君;陳駿丞;陳文進 [zh_TW]
dc.contributor.oralexamcommittee: Yung-Jen Hsu;Min-Chun Hu;Jun-Cheng Chen;Wen-Chin Chen [en]
dc.subject.keyword: 語意編輯, 去偏見, 圖片生成, LoRA [zh_TW]
dc.subject.keyword: Semantic Editing, Debiasing, Image Generative, LoRA [en]
dc.relation.page: 26
dc.identifier.doi: 10.6342/NTU202403591
dc.rights.note: Authorized (worldwide open access)
dc.date.accepted: 2024-08-14
dc.contributor.author-college: College of Electrical Engineering and Computer Science
dc.contributor.author-dept: Department of Computer Science and Information Engineering
Appears in collections: Department of Computer Science and Information Engineering

Files in this item:
File: ntu-112-2.pdf | Size: 5.69 MB | Format: Adobe PDF | View/Open


Except where otherwise noted, all items in this repository are protected by copyright, with all rights reserved.
