Please use this identifier to cite or link to this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/94933
Title: | 反事實生成與無監督異常檢測在多模態醫學影像之應用 Exploring Unsupervised Anomaly Detection in Multimodal Medical Imaging Through Counterfactual Generation |
Authors: | 連品薰 Pin-Hsun Lian |
Advisor: | 張瀚 Han Chang |
Keyword: | facial expression translation, medical imaging anomaly detection, counterfactual generation, isotropic enhancement, osteoarthritis |
Publication Year: | 2024 |
Degree: | Master's |
Abstract: | Counterfactual generation and domain translation are powerful tools in computer vision, offering intuitive observations through direct comparison. Unsupervised learning techniques, in turn, extract valuable information directly from data, removing both the constraint of paired samples and the need for labor-intensive manual annotation. The convergence of these deep learning techniques offers new perspectives on unresolved scientific challenges. This study illustrates the potential of unsupervised counterfactual generation in multimodal medical imaging through two experimental applications: anomaly detection in knee osteoarthritis (OA) MRI and facial expression translation.

We introduce a two-stage generation framework that combines Generative Adversarial Networks (GANs) with diffusion models to produce high-quality counterfactual images. In the OA anomaly detection task, we aim to address the known discrepancy between subjectively perceived pain and radiographic evidence by localizing pain-associated lesions in individual subjects. We employ a multi-task generation model that, in addition to anomaly detection, performs isotropic enhancement, producing uniformly high-resolution 3D volumes that support more accurate diagnosis and treatment. In the facial expression task, we translate neutral faces into happy expressions and evaluate the congruence of the generated counterfactuals with the original images through facial action unit (AU) analysis, identifying five action units closely related to happiness: Cheek Raiser (AU6), Lid Tightener (AU7), Upper Lip Raiser (AU10), Lip Corner Puller (AU12), and Lips Parted (AU25).

In both experiments, conditioning the diffusion model on the initial GAN outputs markedly enhances image quality while preserving extraneous musculoskeletal structures in the MRI scans and personal identity in the facial images. The proposed two-stage framework thus enables effective, high-quality counterfactual generation, and by producing counterfactuals across multimodal images it improves model interpretability in diverse computer vision tasks. |
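At inference time, counterfactual anomaly detection of the kind described in the abstract typically reduces to comparing a subject's image against its generated "healthy" counterfactual, with large pixel-wise differences flagging candidate lesions. A minimal sketch of that comparison step, using NumPy and a toy 2D image (the function name, threshold, and data are illustrative assumptions, not taken from the thesis):

```python
import numpy as np

def anomaly_map(original, counterfactual):
    """Pixel-wise absolute difference between a subject's image and its
    generated 'healthy' counterfactual; large values flag candidate lesions."""
    return np.abs(original.astype(np.float32) - counterfactual.astype(np.float32))

# Toy 2D example: a bright patch stands in for a lesion that the
# counterfactual generator has "removed".
original = np.zeros((8, 8), dtype=np.float32)
original[2:4, 2:4] = 1.0                             # simulated lesion
counterfactual = np.zeros((8, 8), dtype=np.float32)  # generated healthy version

amap = anomaly_map(original, counterfactual)
lesion_mask = amap > 0.5                             # simple threshold localizes the lesion
```

In practice the same difference map would be computed slice-by-slice (or volumetrically) on the isotropically enhanced 3D MRI, rather than on a toy array.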
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/94933 |
DOI: | 10.6342/NTU202403617 |
Fulltext Rights: | Authorized (open access worldwide) |
Appears in Collections: | Graduate Institute of Medical Device and Imaging |
Files in This Item:
File | Size | Format
---|---|---
ntu-112-2.pdf (embargoed until 2029-08-06) | 4.04 MB | Adobe PDF
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.