Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/94638

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 吳家麟 | zh_TW |
| dc.contributor.advisor | Ja-Lin Wu | en |
| dc.contributor.author | 鄭廷瑋 | zh_TW |
| dc.contributor.author | Ting-Wei Cheng | en |
| dc.date.accessioned | 2024-08-16T17:14:31Z | - |
| dc.date.available | 2024-08-17 | - |
| dc.date.copyright | 2024-08-16 | - |
| dc.date.issued | 2024 | - |
| dc.date.submitted | 2024-08-13 | - |
| dc.identifier.citation | [1] M. Brack, F. Friedrich, D. Hintersdorf, L. Struppek, P. Schramowski, and K. Kersting. SEGA: Instructing text-to-image models using semantic guidance. In A. Oh, T. Naumann, A. Globerson, K. Saenko, M. Hardt, and S. Levine, editors, Advances in Neural Information Processing Systems, volume 36, pages 25365–25389. Curran Associates, Inc., 2023.
[2] P. Dhariwal and A. Nichol. Diffusion models beat GANs on image synthesis. Advances in Neural Information Processing Systems, 34:8780–8794, 2021.
[3] M. D'Incà, E. Peruzzo, M. Mancini, D. Xu, V. Goel, X. Xu, Z. Wang, H. Shi, and N. Sebe. OpenBias: Open-set bias detection in text-to-image generative models. ArXiv, abs/2404.07990, 2024.
[4] R. Gandikota, J. Materzynska, T. Zhou, A. Torralba, and D. Bau. Concept sliders: LoRA adaptors for precise control in diffusion models, 2023.
[5] E. Härkönen, A. Hertzmann, J. Lehtinen, and S. Paris. GANSpace: Discovering interpretable GAN controls. In H. Larochelle, M. Ranzato, R. Hadsell, M. Balcan, and H. Lin, editors, Advances in Neural Information Processing Systems, volume 33, pages 9841–9850. Curran Associates, Inc., 2020.
[6] E. J. Hu, Y. Shen, P. Wallis, Z. Allen-Zhu, Y. Li, S. Wang, L. Wang, and W. Chen. LoRA: Low-rank adaptation of large language models, 2021.
[7] A. Ramesh, P. Dhariwal, A. Nichol, C. Chu, and M. Chen. DALL·E 2: A modern text-to-image generation model. arXiv preprint arXiv:2204.06125, 2022.
[8] R. Rombach, A. Blattmann, D. Lorenz, P. Esser, and B. Ommer. High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10684–10695, 2022.
[9] R. Rombach, A. Blattmann, D. Lorenz, P. Esser, and B. Ommer. High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10684–10695, 2022.
[10] Q. Wu, Y. Liu, H. Zhao, A. Kale, T. M. Bui, T. Yu, Z. Lin, Y. Zhang, and S. Chang. Uncovering the disentanglement capability in text-to-image diffusion models. In 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 1900–1910, 2023.
[11] L. Zhang, A. Rao, and M. Agrawala. Adding conditional control to text-to-image diffusion models. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 3836–3847, 2023. | - |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/94638 | - |
| dc.description.abstract | 本論文研究了 CLIP 文本編碼器在條件擴散模型中的適應,以解決語義編輯和去偏見相關的挑戰。我們探討了透過適應 (Adaptation) 在增強生成圖像語義屬性控制方面的有效性,同時減少內在偏見。研究利用了各種解耦策略,並對文字編碼器進行修改,以評估緩解與性別和種族相關偏見的潛力。我們的研究結果表明,通過針對性適應微調文本編碼器可以顯著提高語義控制的精確性和去偏見的有效性。本研究為圖像合成領域的更公平和可控的生成模型的發展做出了貢獻。 | zh_TW |
| dc.description.abstract | This thesis investigates the adaptation of the CLIP text encoder for use in conditional diffusion models to address challenges related to semantic editing and debiasing. We explore the effectiveness of low-rank adaptations in enhancing control over the semantic attributes of generated images while simultaneously reducing inherent biases. The study utilizes various disentanglement strategies and introduces modifications to the text encoder to evaluate the potential for mitigating biases related to gender and ethnicity. Our findings indicate that fine-tuning the text encoder with targeted adaptations can significantly improve the precision of semantic control and the effectiveness of debiasing. This work contributes to the development of fairer and more controllable generative models in the field of image synthesis. | en |
| dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2024-08-16T17:14:31Z No. of bitstreams: 0 | en |
| dc.description.provenance | Made available in DSpace on 2024-08-16T17:14:31Z (GMT). No. of bitstreams: 0 | en |
| dc.description.tableofcontents | Acknowledgements I
摘要 III
Abstract V
Contents VII
List of Figures IX
List of Tables XI
Denotation XIII
Chapter 1 Introduction 1
1.1 Research Objective and Questions 1
1.2 Methodology Overview 2
Chapter 2 Related Works 3
2.1 Disentanglement in Image Generative Models 3
2.2 Bias in Diffusion Models 4
2.3 Guidance Based Methods 4
2.4 Semantic Control 5
Chapter 3 Background 7
3.1 Diffusion Models 7
3.2 Low-Rank Adaptation 8
Chapter 4 Method 9
4.1 Concepts Editing 10
4.2 Debiasing 11
4.2.1 Gender Fairness 12
4.2.2 Ethnic Fairness 13
Chapter 5 Experiments 15
5.1 Concept Adjustment 15
5.1.1 Linearly Adjustable 15
5.1.2 Multiple Concepts Adjustable 16
5.1.3 Parameter Efficiency 16
5.2 Debiasing 17
5.2.1 Gender Equality 18
5.2.2 Ethnic Diversity 19
5.2.3 Training Cost 19
Chapter 6 Conclusion and Discussion 21
6.1 Conclusion 21
6.2 Discussion 21
6.3 Future Work 22
References 25 | - |
| dc.language.iso | en | - |
| dc.subject | 語意編輯 | zh_TW |
| dc.subject | 去偏見 | zh_TW |
| dc.subject | 圖片生成 | zh_TW |
| dc.subject | LoRA | zh_TW |
| dc.subject | Image Generative | en |
| dc.subject | Semantic Editing | en |
| dc.subject | LoRA | en |
| dc.subject | Debiasing | en |
| dc.title | 透過 CLIP 文字編碼器適應進行條件擴散模型中的語義編輯和去偏見 | zh_TW |
| dc.title | Semantic Editing and Debiasing in Conditional Diffusion Models by CLIP Text-Encoder Adaptation | en |
| dc.type | Thesis | - |
| dc.date.schoolyear | 112-2 | - |
| dc.description.degree | 碩士 | - |
| dc.contributor.oralexamcommittee | 許永真;胡敏君;陳駿丞;陳文進 | zh_TW |
| dc.contributor.oralexamcommittee | Yung-Jen Hsu;Min-Chun Hu;Jun-Cheng Chen;Wen-Chin Chen | en |
| dc.subject.keyword | 語意編輯,去偏見,圖片生成,LoRA | zh_TW |
| dc.subject.keyword | Semantic Editing, Debiasing, Image Generative, LoRA | en |
| dc.relation.page | 26 | - |
| dc.identifier.doi | 10.6342/NTU202403591 | - |
| dc.rights.note | 同意授權(全球公開) | - |
| dc.date.accepted | 2024-08-14 | - |
| dc.contributor.author-college | 電機資訊學院 | - |
| dc.contributor.author-dept | 資訊工程學系 | - |
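The abstract above describes attaching low-rank adaptations to the CLIP text encoder so that a semantic edit can be dialed up or down at inference time. As a rough illustration of the underlying LoRA mechanism (not the thesis's actual implementation), the sketch below applies the generic update W + (alpha/r)·BA to a stand-in weight matrix; every name here (`d_in`, `adapted_forward`, `scale`) is illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)

d_in, d_out, r, alpha = 8, 8, 2, 4.0

W = rng.standard_normal((d_out, d_in))      # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable down-projection, rank r
B = np.zeros((d_out, r))                    # trainable up-projection, zero-initialised

def adapted_forward(x, scale=1.0):
    """Forward pass through the LoRA-adapted linear layer.

    The low-rank update B @ A enters additively and is scaled by
    alpha/r times `scale`. Because B starts at zero, scale has no
    effect until the adapter is trained; afterwards, varying `scale`
    at inference time interpolates between the frozen model (scale=0)
    and the fully applied semantic edit (scale=1), which is the kind
    of linear adjustability the abstract alludes to.
    """
    delta = (alpha / r) * (B @ A)           # (d_out, d_in) low-rank update
    return x @ (W + scale * delta).T

x = rng.standard_normal((1, d_in))
# With B still all-zero, the adapter is inactive at any scale.
assert np.allclose(adapted_forward(x, scale=1.0), x @ W.T)
```

Only A and B (here 2·8 + 8·2 = 32 values) would be trained, versus 64 for the full matrix; the parameter saving grows with the size of the adapted projections.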
| Appears in Collections: | 資訊工程學系 | |
Files in This Item:
| File | Size | Format | |
|---|---|---|---|
| ntu-112-2.pdf | 5.69 MB | Adobe PDF | View/Open |
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.
