NTU Theses and Dissertations Repository
Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/94933
Full metadata record
DC Field | Value | Language
dc.contributor.advisor: 張瀚 (zh_TW)
dc.contributor.advisor: Han Chang (en)
dc.contributor.author: 連品薰 (zh_TW)
dc.contributor.author: Pin-Hsun Lian (en)
dc.date.accessioned: 2024-08-21T16:45:10Z
dc.date.available: 2024-08-22
dc.date.copyright: 2024-08-21
dc.date.issued: 2024
dc.date.submitted: 2024-08-06
dc.identifier.citation1. Goodfellow, I., et al., Generative adversarial networks. Communications of the ACM, 2020. 63(11): p. 139-144.
2. Mirza, M. and S. Osindero, Conditional generative adversarial nets. arXiv preprint arXiv:1411.1784, 2014.
3. Isola, P., et al. Image-to-image translation with conditional adversarial networks. in Proceedings of the IEEE conference on computer vision and pattern recognition. 2017.
4. Zhu, J.-Y., et al. Unpaired image-to-image translation using cycle-consistent adversarial networks. in Proceedings of the IEEE international conference on computer vision. 2017.
5. Odena, A., C. Olah, and J. Shlens. Conditional image synthesis with auxiliary classifier gans. in International conference on machine learning. 2017. PMLR.
6. Yang, L., et al., Diffusion models: A comprehensive survey of methods and applications. ACM Computing Surveys, 2023. 56(4): p. 1-39.
7. Ho, J., A. Jain, and P. Abbeel, Denoising diffusion probabilistic models. Advances in neural information processing systems, 2020. 33: p. 6840-6851.
8. Saharia, C., et al. Palette: Image-to-image diffusion models. in ACM SIGGRAPH 2022 conference proceedings. 2022.
9. Petit, O., et al. U-net transformer: Self and cross attention for medical image segmentation. in Machine Learning in Medical Imaging: 12th International Workshop, MLMI 2021, Held in Conjunction with MICCAI 2021, Strasbourg, France, September 27, 2021, Proceedings 12. 2021. Springer.
10. Wilson, G. and D.J. Cook, A survey of unsupervised deep domain adaptation. ACM Transactions on Intelligent Systems and Technology (TIST), 2020. 11(5): p. 1-46.
11. Ganin, Y., et al., Domain-adversarial training of neural networks. Journal of machine learning research, 2016. 17(59): p. 1-35.
12. Shrivastava, A., et al. Learning from simulated and unsupervised images through adversarial training. in Proceedings of the IEEE conference on computer vision and pattern recognition. 2017.
13. Bousmalis, K., et al. Using simulation and domain adaptation to improve efficiency of deep robotic grasping. in 2018 IEEE international conference on robotics and automation (ICRA). 2018. IEEE.
14. Hoffman, J., et al. Cycada: Cycle-consistent adversarial domain adaptation. in International conference on machine learning. 2018. Pmlr.
15. Verma, S., J. Dickerson, and K. Hines, Counterfactual explanations for machine learning: A review. arXiv preprint arXiv:2010.10596, 2020. 2.
16. Van Looveren, A. and J. Klaise. Interpretable counterfactual explanations guided by prototypes. in Joint European Conference on Machine Learning and Knowledge Discovery in Databases. 2021. Springer.
17. Lu, D., et al., Reconsidering generative objectives for counterfactual reasoning. Advances in Neural Information Processing Systems, 2020. 33: p. 21539-21553.
18. Sauer, A. and A. Geiger, Counterfactual generative networks. arXiv preprint arXiv:2101.06046, 2021.
19. Wang, Z., J. Chen, and S.C. Hoi, Deep learning for image super-resolution: A survey. IEEE transactions on pattern analysis and machine intelligence, 2020. 43(10): p. 3365-3387.
20. Yuan, Y., et al. Unsupervised image super-resolution using cycle-in-cycle generative adversarial networks. in Proceedings of the IEEE conference on computer vision and pattern recognition workshops. 2018.
21. Peng, C., et al. Saint: spatially aware interpolation network for medical slice synthesis. in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020.
22. Zhao, C., et al., Applications of a deep learning method for anti-aliasing and super-resolution in MRI. Magnetic resonance imaging, 2019. 64: p. 132-141.
23. Li, J., J.C. Koh, and W.-S. Lee. HRINet: alternative supervision network for high-resolution CT image interpolation. in 2020 IEEE International Conference on Image Processing (ICIP). 2020. IEEE.
24. Choi, Y., et al. Stargan: Unified generative adversarial networks for multi-domain image-to-image translation. in Proceedings of the IEEE conference on computer vision and pattern recognition. 2018.
25. Wu, R., et al. Cascade ef-gan: Progressive facial expression editing with local focuses. in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020.
26. Xia, Y., et al., Local and global perception generative adversarial network for facial expression synthesis. IEEE Transactions on Circuits and Systems for Video Technology, 2021. 32(3): p. 1443-1452.
27. Pumarola, A., et al. Ganimation: Anatomically-aware facial animation from a single image. in Proceedings of the European conference on computer vision (ECCV). 2018.
28. Zhao, Y., et al. A Facial Expression Transfer Method Based on 3DMM and Diffusion Models. in ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 2024. IEEE.
29. Cohn, J.F., Z. Ambadar, and P. Ekman, Observer-based measurement of facial expression with the Facial Action Coding System. The handbook of emotion elicitation and assessment, 2007. 1(3): p. 203-221.
30. Durán, J.I., R. Reisenzein, and J.-M. Fernández-Dols, Coherence between emotions and facial expressions. The science of facial expression, 2017: p. 107-129.
31. Barrett, L.F., et al., Emotional expressions reconsidered: Challenges to inferring emotion from human facial movements. Psychological science in the public interest, 2019. 20(1): p. 1-68.
32. Robinson, M.D. and G.L. Clore, Belief and feeling: evidence for an accessibility model of emotional self-report. Psychological bulletin, 2002. 128(6): p. 934.
33. Corneanu, C., M. Madadi, and S. Escalera. Deep structure inference network for facial action unit recognition. in Proceedings of the european conference on computer vision (ECCV). 2018.
34. Jacob, G.M. and B. Stenger. Facial action unit detection with transformers. in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2021.
35. Khorrami, P., T. Paine, and T. Huang. Do deep neural networks learn facial action units when doing expression recognition? in Proceedings of the IEEE international conference on computer vision workshops. 2015.
36. Fujioka, T., et al., Efficient anomaly detection with generative adversarial network for breast ultrasound imaging. Diagnostics, 2020. 10(7): p. 456.
37. Nevitt, M., D. Felson, and G. Lester, The osteoarthritis initiative. Protocol for the cohort study, 2006. 1: p. 2.
38. Park, T., et al. Semantic image synthesis with spatially-adaptive normalization. in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2019.
39. Liu, Z., et al., Large-scale celebfaces attributes (celeba) dataset. Retrieved August, 2018. 15(2018): p. 11.
40. Deng, J., et al. Retinaface: Single-shot multi-level face localisation in the wild. in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2020.
41. Chen, S., et al. Mobilefacenets: Efficient cnns for accurate real-time face verification on mobile devices. in Biometric Recognition: 13th Chinese Conference, CCBR 2018, Urumqi, China, August 11-12, 2018, Proceedings 13. 2018. Springer.
42. Luo, C., et al., Learning multi-dimensional edge feature-based au relation graph for facial action unit recognition. arXiv preprint arXiv:2205.01782, 2022.
43. Ekman, P. and W.V. Friesen, Felt, false, and miserable smiles. Journal of nonverbal behavior, 1982. 6(4): p. 238-252.
44. Girard, J.M., et al. Reconsidering the Duchenne smile: indicator of positive emotion or artifact of smile intensity? in 2019 8th International Conference on Affective Computing and Intelligent Interaction (ACII). 2019. IEEE.
45. Kohler, C.G., et al., Differences in facial expressions of four universal emotions. Psychiatry research, 2004. 128(3): p. 235-244.
-
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/94933
dc.description.abstract (zh_TW):
反事實生成與領域轉換能幫助我們透過比較差異來形成直覺的觀察,非監督學習則能透過直接從資料擷取重要資訊來訓練模型,免除了成對資料的限制以及費時的人工標記作業。這兩種電腦視覺深度學習技術的結合將有助於我們對未解決之科學問題形成新的觀點。本研究以膝關節退化疾病(OA)的核磁共振影像異常偵測以及臉部表情轉換兩項實驗,呈現了非監督反事實生成技術在多模態醫學影像上的應用。
本研究提出了結合對抗生成模型以及擴散模型的兩階段影像生成架構,此架構能夠有效地生成高品質之反事實影像。在OA異常檢測任務中,我們意圖透過定位個別受試者與疼痛相關的病變,探究過去研究中主觀認知疼痛程度和影像學證據之間的差異。本實驗中我們採用了多任務生成模型,除了異常偵測之外,模型也實現了影像均向性的增強,以生成高解析度的3D影像來增加診斷和治療的準確度。
在臉部表情轉換任務中,我們意圖將中性的臉部影像轉換為愉快的表情,並以臉部動作單元(AU)對生成之影像進行分析以測試模型的有效性。從中我們標定出了五個顯著和愉快相關的動作單元:臉頰上提(AU6)、眼瞼瞇起(AU7)、上唇提升(AU10)、嘴角上揚(AU12)和雙唇分開(AU25)。實驗證實我們的反事實生成技術能夠全面地掌握AU和情緒之間的關係,並生成高解析度的臉部表情影像。
本研究提出的兩階段影像生成架構能讓模型在最大程度地保留骨骼肌肉結構和個人身份特徵的前提下生成有效且高品質的反事實影像。藉由在多模態影像中生成反事實影像,本研究得以在多元的電腦視覺任務中提升模型的可解釋性。
dc.description.abstract (en):
Counterfactual generation and domain translation are powerful tools in computer vision, offering intuitive observations through direct comparison. Unsupervised learning, in turn, extracts valuable information from data autonomously, removing the need for paired data and labor-intensive annotation. The convergence of these deep learning techniques opens new perspectives on unresolved scientific questions. This study illustrates the potential of unsupervised counterfactual generation through two experimental applications: anomaly detection in knee osteoarthritis (OA) MR images and facial expression translation.
We introduce a two-stage generation framework that combines Generative Adversarial Networks (GANs) with diffusion models to enhance the quality of generated counterfactuals. In the OA anomaly detection task, our objective is to address the known discrepancy between perceived pain and radiographic evidence by localizing pain-associated lesions in individual subjects. We employ a multi-task generation model that also performs isotropic enhancement, enabling a uniform 3D counterfactual comparison for more accurate diagnosis and treatment.
In the facial expression task, we translate between neutral and happy expressions. We evaluate the congruence of the generated counterfactuals with the original images through action unit (AU) analysis, identifying five action units closely related to happiness: Cheek Raiser (AU6), Lid Tightener (AU7), Upper Lip Raiser (AU10), Lip Corner Puller (AU12), and Lips Parted (AU25).
In both experiments, the integration of diffusion models, with the initial GAN outputs serving as conditions, markedly enhances image quality while preserving task-irrelevant anatomical structures in MRI scans and personal identity in facial images. This two-stage framework substantially advances the generation of high-quality counterfactual images, underscoring the efficacy of combining GANs with diffusion models across computer vision tasks involving multi-modal images. (en)
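As a concrete illustration of the two-stage framework described in the abstract, the following is a minimal, hypothetical sketch (PyTorch, toy networks and shapes; not the thesis implementation): a stage-1 GAN generator produces a coarse counterfactual, a DDPM-style conditional denoiser refines it with the GAN output concatenated as the condition, and the absolute difference from the input serves as an anomaly map.

import torch
import torch.nn as nn

class ToyGenerator(nn.Module):
    # Stage 1: GAN generator mapping a source-domain image to a coarse counterfactual.
    def __init__(self, ch=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(ch, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, ch, 3, padding=1), nn.Tanh(),
        )
    def forward(self, x):
        return self.net(x)

class ToyCondDenoiser(nn.Module):
    # Stage 2: conditional noise predictor eps(x_t, t, cond); the GAN output is the condition.
    def __init__(self, ch=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2 * ch + 1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, ch, 3, padding=1),
        )
    def forward(self, x_t, t, cond):
        # Broadcast the normalized timestep as an extra input channel.
        t_map = t.view(-1, 1, 1, 1).expand(-1, 1, *x_t.shape[2:])
        return self.net(torch.cat([x_t, cond, t_map], dim=1))

@torch.no_grad()
def two_stage_counterfactual(x, gan, denoiser, T=50):
    # DDPM ancestral sampling, conditioned on the stage-1 output.
    betas = torch.linspace(1e-4, 0.02, T)
    alphas = 1.0 - betas
    alpha_bar = torch.cumprod(alphas, dim=0)
    cond = gan(x)                      # stage 1: coarse counterfactual
    x_t = torch.randn_like(x)          # stage 2: refine from pure noise
    for i in reversed(range(T)):
        t = torch.full((x.shape[0],), i / T)
        eps = denoiser(x_t, t, cond)
        mean = (x_t - betas[i] / torch.sqrt(1 - alpha_bar[i]) * eps) / torch.sqrt(alphas[i])
        noise = torch.randn_like(x_t) if i > 0 else torch.zeros_like(x_t)
        x_t = mean + torch.sqrt(betas[i]) * noise
    return x_t, cond

x = torch.rand(2, 1, 32, 32) * 2 - 1                 # toy batch of 1-channel images
refined, coarse = two_stage_counterfactual(x, ToyGenerator(), ToyCondDenoiser())
anomaly_map = (refined - x).abs()                    # counterfactual difference for lesion localization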
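The isotropic enhancement mentioned for the OA task can be contrasted with the naive baseline it improves upon: plain trilinear upsampling of an anisotropic MR volume to isotropic voxels. A short sketch with assumed shapes (not the thesis method):

import torch
import torch.nn.functional as F

vol = torch.rand(1, 1, 40, 320, 320)   # (N, C, slices, H, W): 40 thick slices, fine in-plane grid
iso = F.interpolate(vol, size=(320, 320, 320), mode="trilinear", align_corners=False)
print(iso.shape)                        # torch.Size([1, 1, 320, 320, 320])

A learned generator replaces this interpolation so that the synthesized through-plane slices carry anatomically plausible detail rather than blurred averages.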
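The AU analysis in the facial expression task amounts to measuring how AU activations shift between each original image and its counterfactual, then ranking the shifts. A hedged sketch follows; detect_aus is a hypothetical stand-in for a real AU recognition model (the thesis cites one, reference 42):

import random

AU_NAMES = {6: "Cheek Raiser", 7: "Lid Tightener", 10: "Upper Lip Raiser",
            12: "Lip Corner Puller", 25: "Lips Parted"}

def detect_aus(image):
    # Hypothetical AU detector: returns {AU index: activation in [0, 1]}.
    rng = random.Random(hash(image))
    return {au: rng.random() for au in AU_NAMES}

def au_shift(originals, counterfactuals):
    # Mean per-AU activation change across image pairs, largest increase first.
    deltas = {au: 0.0 for au in AU_NAMES}
    for orig, cf in zip(originals, counterfactuals):
        before, after = detect_aus(orig), detect_aus(cf)
        for au in deltas:
            deltas[au] += (after[au] - before[au]) / len(originals)
    return sorted(deltas.items(), key=lambda kv: -kv[1])

for au, d in au_shift(["neutral_1", "neutral_2"], ["happy_1", "happy_2"]):
    print(f"AU{au} ({AU_NAMES[au]}): mean shift {d:+.3f}")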
dc.description.provenance: Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2024-08-21T16:45:10Z. No. of bitstreams: 0 (en)
dc.description.provenance: Made available in DSpace on 2024-08-21T16:45:10Z (GMT). No. of bitstreams: 0 (en)
dc.description.tableofcontents:
誌謝 (Acknowledgements) i
中文摘要 (Chinese Abstract) ii
ABSTRACT iii
CONTENTS v
LIST OF FIGURES viii
LIST OF TABLES x
Chapter 1 Introduction 1
1.1 Generative Adversarial Networks 1
1.2 Diffusion Models 3
1.2.1 Foundations of Denoising Diffusion Probabilistic Models 3
1.2.2 Conditional Diffusion Models 4
1.3 Unsupervised Deep Domain Adaptation 5
1.4 Counterfactual Explanation with Generative Models 6
1.5 Super-resolution in Medical Imaging 8
1.6 Unsupervised Anomaly Detection in Medical Imaging 11
1.7 Facial Expression Manipulation 12
1.8 Applications of Action Units in Facial Expression Recognition 13
Chapter 2 3D Pain-Associated Lesion Localization from Knee MRI 17
2.1 OAI Dataset 18
2.2 OAI Data Preprocessing 18
2.3 First-stage GAN Model 19
2.4 Second-stage Diffusion Model 20
2.5 Quality Evaluation of Synthesized Images 22
2.6 Results 24
2.6.1 Image Quality of the Generated Counterfactual Images 24
2.6.2 Visual Comparisons of Isotropically Generated MR Images 26
2.7 Conclusion 29
Chapter 3 Facial Expression Generation and Action Unit Analysis 31
3.1 Facial Expression Dataset 32
3.1.1 CelebA 32
3.1.2 AffectNet 33
3.2 Facial Expression Data Preprocessing 33
3.3 Model Architecture: First-stage GAN 34
3.3.1 Objective Function 37
3.4 Model Architecture: Diffusion Model 39
3.4.1 AU Mask Generation 39
3.4.2 Conditional Diffusion Model 40
3.5 Evaluation 43
3.5.1 Class Flip Rate 43
3.5.2 Action Unit Evaluation Metrics 44
3.5.3 Image Quality Comparison 45
3.5.4 Human Perception of Generated Counterfactuals 45
3.6 Results 47
3.6.1 GAN-Generated Counterfactuals 47
3.6.2 Diffusion-Generated Counterfactuals 47
3.6.3 Class Flip Rate of the Generated Images 51
3.6.4 Identifying Action Units Associated with Smile/Happiness Facial Expression 52
3.6.5 Image Quality Comparison 57
3.6.6 Evaluation of Human Perception of Generated Counterfactuals 57
3.7 Discussion 59
3.7.1 Data Variance in the Facial Expression Datasets 59
3.7.2 AU Interpretability in Counterfactual Generation 60
3.7.3 Comparison of GAN and Diffusion Models 63
3.7.4 Limitations 64
3.7.5 Future Work 66
Chapter 4 References 67
dc.language.iso: en
dc.title: 反事實生成與無監督異常檢測在多模態醫學影像之應用 (zh_TW)
dc.title: Exploring Unsupervised Anomaly Detection in Multimodal Medical Imaging Through Counterfactual Generation (en)
dc.type: Thesis
dc.date.schoolyear: 112-2
dc.description.degree: 碩士 (Master's)
dc.contributor.oralexamcommittee: 吳文超;梁祥光 (zh_TW)
dc.contributor.oralexamcommittee: Wen-Chau Wu;Hsiang-Kuang Liang (en)
dc.subject.keyword: 面部表情轉換,醫學影像異常偵測,反事實生成,均向性增強,退化性關節炎 (zh_TW)
dc.subject.keyword: facial expression translation, medical imaging anomaly detection, counterfactual generation, isotropic enhancement, osteoarthritis (en)
dc.relation.page: 71
dc.identifier.doi: 10.6342/NTU202403617
dc.rights.note: 同意授權(全球公開) (Authorized: worldwide open access)
dc.date.accepted: 2024-08-07
dc.contributor.author-college: 醫學院 (College of Medicine)
dc.contributor.author-dept: 醫療器材與醫學影像研究所 (Institute of Medical Device and Imaging)
dc.date.embargo-lift: 2029-08-06
Appears in Collections: 醫療器材與醫學影像研究所 (Institute of Medical Device and Imaging)

Files in This Item:
File | Size | Format
ntu-112-2.pdf (embargoed until 2029-08-06) | 4.04 MB | Adobe PDF

