Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/101265

Full metadata record

DC Field | Value | Language
dc.contributor.advisor: 王鈺強 (zh_TW)
dc.contributor.advisor: Yu-Chiang Frank Wang (en)
dc.contributor.author: 廖宇謙 (zh_TW)
dc.contributor.author: Yu-Chien Liao (en)
dc.date.accessioned: 2026-01-13T16:08:43Z
dc.date.available: 2026-01-14
dc.date.copyright: 2026-01-13
dc.date.issued: 2025
dc.date.submitted: 2025-12-17
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/101265
dc.description.abstract: 在實際應用中,如何有效率地不斷更新擴散模型是個難題。因此,我們提出了一個創新的學習策略『萃取概念神經元』(Concept Neuron Selection)。這個簡單且有效的方法,可以在連續學習的條件下對擴散模型實現客製化。CNS 在擴散模型中辨認和目標概念相關的神經元。為了同時避免災難性遺忘以及保留零樣本文字轉圖像的生成能力,CNS 以增量方式微調概念神經元,並共同保留先前學到的概念知識。在現實世界資料集的評估下顯示,CNS 在只需最少參數調整的情況下達到最高的效能,並且無論在單一或多概念客製化任務中皆優於以往方法。CNS 同時實現無融合操作,有效降低持續客製化所需的記憶體儲存與處理時間。 (zh_TW)
dc.description.abstract: Updating diffusion models in an incremental setting is practical for real-world applications yet computationally challenging. We present a novel learning strategy, Concept Neuron Selection (CNS), a simple yet effective approach to personalization in a continual learning scheme. CNS identifies the neurons in a diffusion model that are closely related to the target concepts. To mitigate catastrophic forgetting while preserving zero-shot text-to-image generation ability, CNS finetunes these concept neurons incrementally and jointly preserves knowledge learned from previous concepts. Evaluation on real-world datasets demonstrates that CNS achieves state-of-the-art performance with minimal parameter adjustments, outperforming previous methods on both single- and multi-concept personalization tasks. CNS also operates fusion-free, reducing memory storage and processing time for continual personalization. (en)
dc.description.provenance: Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2026-01-13T16:08:43Z. No. of bitstreams: 0 (en)
dc.description.provenance: Made available in DSpace on 2026-01-13T16:08:43Z (GMT). No. of bitstreams: 0 (en)
dc.description.tableofcontents: Verification Letter from the Oral Examination Committee i
Acknowledgements ii
摘要 iii
Abstract iv
Contents v
List of Figures vii
List of Tables ix
Chapter 1 Introduction 1
Chapter 2 Related Work 4
2.1 Diffusion model personalization 4
2.2 Neuron selection 6
Chapter 3 Preliminary 8
Chapter 4 Method 10
4.1 Problem formulation and framework overview 10
4.2 Learning of concept neurons 11
4.3 Continual personalization with concept neurons 14
Chapter 5 Experiment 17
5.1 Experimental setup 17
5.2 Qualitative comparisons 18
5.3 Quantitative comparisons 19
5.4 Ablation Study 21
Chapter 6 Conclusion 23
References 24
Appendix A — Implementation Details and More Experiments 31
A.1 Implementation Details 31
A.1.1 Datasets 31
A.1.2 Baseline methods 33
A.1.3 Threshold of the neuron selection 34
A.2 Additional Experimental Results 35
A.2.1 Use of similarity scores 35
A.2.2 Percentage of the updated neurons 35
A.2.3 Percentage of the overlapping between concept neurons 36
A.2.4 Empirical experiment of general neurons 36
A.2.5 Visualization of ablation study 36
A.2.6 Region control 37
dc.language.iso: en
dc.subject: 機器學習
dc.subject: 文生圖模型
dc.subject: 擴散模型
dc.subject: 連續學習
dc.subject: 客製化
dc.subject: Machine Learning
dc.subject: Text-to-image Model
dc.subject: Diffusion Model
dc.subject: Continual Learning
dc.subject: Personalization
dc.title: 擴散模型之連續客製化 (zh_TW)
dc.title: Continual Personalization for Diffusion Models (en)
dc.type: Thesis
dc.date.schoolyear: 114-1
dc.description.degree: 碩士 (Master)
dc.contributor.oralexamcommittee: 陳祝嵩; 楊福恩 (zh_TW)
dc.contributor.oralexamcommittee: Chu-Song Chen; Fu-En Yang (en)
dc.subject.keyword: 機器學習, 文生圖模型, 擴散模型, 連續學習, 客製化 (zh_TW)
dc.subject.keyword: Machine Learning, Text-to-image Model, Diffusion Model, Continual Learning, Personalization (en)
dc.relation.page: 39
dc.identifier.doi: 10.6342/NTU202500989
dc.rights.note: 同意授權 (open access worldwide)
dc.date.accepted: 2025-12-18
dc.contributor.author-college: 電機資訊學院 (College of Electrical Engineering and Computer Science)
dc.contributor.author-dept: 電信工程學研究所 (Graduate Institute of Communication Engineering)
dc.date.embargo-lift: 2026-01-14
Appears in Collections: Graduate Institute of Communication Engineering (電信工程學研究所)

Files in This Item:
File | Size | Format
ntu-114-1.pdf | 64.97 MB | Adobe PDF


All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.
