A Training-Free Industrial Anomaly Generation Framework Based on Structural Alignment Optimization

許儒怡; Ruyi Xu

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/99221

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	鄭文皇	zh_TW
dc.contributor.advisor	Wen-Huang Cheng	en
dc.contributor.author	許儒怡	zh_TW
dc.contributor.author	Ruyi Xu	en
dc.date.accessioned	2025-08-21T16:51:59Z	-
dc.date.available	2025-08-22	-
dc.date.copyright	2025-08-21	-
dc.date.issued	2025	-
dc.date.submitted	2025-08-06	-
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/99221	-
dc.description.abstract	在工業缺陷檢測任務中，異常樣本生成(Anomaly Generation)對於克服缺陷樣本稀缺的挑戰仍相當關鍵。儘管近期已有諸多進展，現有方法在面對複雜且多樣的缺陷時仍有挑戰，尤其是在每種類型僅有單一異常樣本可用的情境下。為此，本研究提出一種創新的免訓練異常生成架構——TF-IDG(Training-Free Industrial Defect Generation)，能於僅有單一樣本時產生多樣的異常樣本。我們引入「特徵對齊策略」(Feature Alignment Strategy)，以縮小生成缺陷與真實缺陷之間的分佈差距，提升外觀精準度;透過「自適應異常遮罩模組」(Adaptive Anomaly Mask Module)，有效解決小型異常易被忽略的問題，強化合成缺陷與遮罩的一致性; 同時，藉由「紋理保留模組」(Texture Preservation Module)，從正常影像中提取背景線索，確保異常樣本與背景融合自然、逼真。評估結果指出，在 VisA dataset 中，本方法能夠產出更加準確且多樣的異常樣本，在圖像質量指標 Local IS 中達到 3.90，異常分類準確度 (Acc) 達到 79.84%，突破現有先進方法的性能。同時 TF-IDG 顯著提升下游異常檢測模型的效能，提升先進異常偵測模型 5.5%。	zh_TW
dc.description.abstract	Anomaly generation is crucial for addressing the scarcity of defective samples in industrial anomaly inspection. Despite recent progress, existing methods struggle with complex, multi-type defects, especially in one-shot settings where only a single anomalous sample is available per defect type. We propose TF-IDG (Training-Free Industrial Defect Generation), a novel training-free framework that synthesizes diverse anomalies from a single sample. A Feature Alignment strategy narrows the distributional gap between generated and real defects to refine their appearance. The Adaptive Anomaly Mask module mitigates the omission of small defects, improving consistency between synthetic anomalies and their masks. To enhance realism, the Texture Preservation module extracts background cues from anomaly-free images so that generated anomalies blend naturally with the background. Experiments on the VisA dataset show that TF-IDG generates more accurate and diverse samples, achieving a Local IS score of 3.90 and an anomaly classification accuracy of 79.84%, outperforming prior methods. Furthermore, TF-IDG improves downstream anomaly detection, yielding a 5.5% gain for advanced detection frameworks.	en
dc.description.provenance	Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-08-21T16:51:59Z No. of bitstreams: 0	en
dc.description.provenance	Made available in DSpace on 2025-08-21T16:51:59Z (GMT). No. of bitstreams: 0	en
dc.description.tableofcontents	Verification Letter from the Oral Examination Committee i; Acknowledgements ii; Abstract (Chinese) iv; Abstract v; Contents vi; List of Figures ix; List of Tables xi; Chapter 1 Introduction 1; 1.1 Background 1; 1.2 Motivation 2; 1.3 Proposed Method 3; 1.4 Outline of the Thesis 5; 1.5 Publication 6; Chapter 2 Related Works 7; 2.1 Image Editing with Diffusion Models 7; 2.1.1 Foundations of Diffusion Models 8; 2.1.2 Controllable and Customizable Diffusion Models 11; 2.2 Anomaly Generation 12; Chapter 3 Method 15; 3.1 Score-based guidance in Diffusion Model 15; 3.2 Overall Architecture 17; 3.2.1 Feature Alignment 19; 3.2.2 Adaptive Anomaly Mask guidance 22; 3.2.3 Texture Preservation 25; 3.2.4 TF-IDG Algorithm Summary 26; Chapter 4 Experiments 28; 4.1 Dataset 28; 4.1.1 MVTec AD dataset 29; 4.1.2 VisA dataset 29; 4.2 Experiment Settings 30; 4.2.1 Experimental protocols 30; 4.2.2 Metric 30; 4.2.3 Implementation Details 32; 4.3 Comparison in Anomaly Generation 32; 4.3.1 Anomaly generation quality 33; 4.3.2 Anomaly generation for anomaly inspection 36; 4.3.3 Comparison with Anomaly Detection Models 38; 4.4 Ablation Study 39; 4.5 Resource requirement 42; 4.6 Extended Application 43; Chapter 5 Conclusion 48; 5.0.1 Contributions 48; 5.0.2 Limitations and Future Directions 49; References 50; Appendix A: Full Qualitative Results 58	-
dc.language.iso	en	-
dc.subject	擴散式生成模型	zh_TW
dc.subject	得分函數引導	zh_TW
dc.subject	工業瑕疵生成	zh_TW
dc.subject	工業異常偵測	zh_TW
dc.subject	Industrial Anomaly Detection	en
dc.subject	Diffusion models	en
dc.subject	Industrial Defect Generation	en
dc.subject	Score-based function guidance	en
dc.title	基於結構對齊優化的免訓練工業異常生成框架	zh_TW
dc.title	A Training-Free Industrial Anomaly Generation Framework Based on Structural Alignment Optimization	en
dc.type	Thesis	-
dc.date.schoolyear	113-2	-
dc.description.degree	Master's	-
dc.contributor.oralexamcommittee	莊永裕;陳駿丞;花凱龍	zh_TW
dc.contributor.oralexamcommittee	Yung-Yu Chuang;Jun-Cheng Chen;Kai-Lung Hua	en
dc.subject.keyword	工業異常偵測,擴散式生成模型,工業瑕疵生成,得分函數引導	zh_TW
dc.subject.keyword	Industrial Anomaly Detection,Diffusion models,Industrial Defect Generation,Score-based function guidance	en
dc.relation.page	71	-
dc.identifier.doi	10.6342/NTU202503440	-
dc.rights.note	Not authorized	-
dc.date.accepted	2025-08-10	-
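dc.contributor.author-college	College of Electrical Engineering and Computer Science	-
dc.contributor.author-dept	Department of Computer Science and Information Engineering	-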
dc.date.embargo-lift	N/A	-
Appears in Collections:	Department of Computer Science and Information Engineering

Files in this item:

File	Size	Format
ntu-113-2.pdf (not authorized for public access)	173.7 MB	Adobe PDF

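Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.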