Semantic Editing and Debiasing in Conditional Diffusion Models by CLIP Text-Encoder Adaptation

鄭廷瑋; Ting-Wei Cheng

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/94638

Title:	Semantic Editing and Debiasing in Conditional Diffusion Models by CLIP Text-Encoder Adaptation (original title: 透過 CLIP 文字編碼器適應進行條件擴散模型中的語義編輯和去偏見)
Author:	Ting-Wei Cheng (鄭廷瑋)
Advisor:	Ja-Lin Wu (吳家麟)
Keywords:	Semantic Editing, Debiasing, Image Generation, LoRA
Publication Year:	2024
Degree:	Master's
Abstract:	This thesis investigates adapting the CLIP text encoder used in conditional diffusion models to address challenges in semantic editing and debiasing. We explore the effectiveness of low-rank adaptation (LoRA) in enhancing control over the semantic attributes of generated images while simultaneously reducing inherent biases. The study applies various disentanglement strategies and introduces modifications to the text encoder to evaluate their potential for mitigating biases related to gender and ethnicity. Our findings indicate that fine-tuning the text encoder with targeted adaptations can significantly improve both the precision of semantic control and the effectiveness of debiasing. This work contributes to the development of fairer and more controllable generative models in the field of image synthesis.
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/94638
DOI:	10.6342/NTU202403591
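Full-Text Authorization:	Authorized (publicly available worldwide)
Appears in Collections:	Department of Computer Science and Information Engineering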

Files in This Item:

File	Size	Format
ntu-112-2.pdf	5.69 MB	Adobe PDF

Items in this repository are protected by copyright, with all rights reserved, unless their copyright terms are otherwise specified.
