Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/92996

| Title: | 胸部X光影像模型的泛化研究:使用COVID-19案例探討 A model-generalization study with chest X-ray images: a case study on COVID-19 |
| Author: | 陳冠穎 Guan-Ying Chen |
| Advisor: | 林致廷 Chih-Ting Lin |
| Keywords: | chest X-ray, COVID-19, contrastive learning, contrastive loss, model generalization |
| Publication year: | 2024 |
| Degree: | Doctoral |
| Abstract: | The global pandemic has made it difficult to build reliable chest X-ray (CXR) diagnostic models. Since the outbreak of COVID-19 in 2019, the computer vision community has adapted traditional machine learning and deep learning models to COVID-19 detection, with impressive reported performance. However, because the outbreak was sudden and collecting and annotating COVID-19 CXR data requires substantial labor and time, these models were usually developed with limited data. As a result, most models tend to identify features that are specific to the datasets rather than related to the disease, which makes them hard to apply widely.
In the post-pandemic era, we re-collected a sufficiently large COVID-19 training dataset and two external test datasets to examine the factors that influence model generalization. We focus on identifying potential sources of data bias in the spatial and frequency domains of the images. In the spatial domain, we use two lung segmentation models (a traditional one and a generative one) and compare how well different segmentation approaches remove background noise and help generalization. In the frequency domain, we use the discrete wavelet transform (DWT) or the dual-tree complex wavelet transform (DTCWT) to generate different combinations of frequency sub-bands and thereby exclude noisy bands. The experimental results show that when the two lung segmentation models work together and the wavelet transform is used to remove high-frequency signals, the average accuracy of deep learning models and classical machine learning models on the two external datasets improves by approximately 7% and 16%, respectively, compared with models trained on the original images. However, we also note that this architecture is not flawless: lung segmentation models can themselves be biased by their training datasets, and incorrect segmentation results lead to poor downstream feature extraction and classification.
To enable the model to learn robust features from the target (COVID-19) dataset, we propose a novel training method called Multi-Task Supervised Contrastive Learning (MTSCL). Leveraging the diversity of large-scale public CXR datasets, MTSCL contrasts image representations from the large-scale dataset and the target dataset, which helps the model focus on disease-related features in the target dataset. We use MTSCL to train a lung abnormality classification model and combine it with tree models to form a two-stage hierarchical classification framework for COVID-19 classification. The framework achieves an average accuracy of 75.19% on the two external datasets, outperforming state-of-the-art deep learning models (74.08%) and traditional machine learning models (68.67%). In the first stage, the MTSCL model detects lung abnormalities better than models trained with other learning frameworks, with an average AUC improvement rate of 1.03% to 5.33%. In the second stage, the lung lesion masks generated by MTSCL, combined with wavelet filtering for image processing, improve the generalization of the tree model in distinguishing COVID-19 from typical pneumonia, raising the average AUC score by 12.38%. These results show that the MTSCL architecture enables the model to learn the key features of a target dataset with limited data, which improves its ability to generalize to other data. |
| URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/92996 |
| DOI: | 10.6342/NTU202401567 |
| Full-text authorization: | Not authorized |
| Appears in collections: | Graduate Institute of Biomedical Electronics and Bioinformatics |
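
The abstract mentions using a wavelet transform to remove high-frequency signals from images before classification. A minimal sketch of that idea follows, assuming a single-level 2D Haar transform applied to a small grayscale image stored as nested lists. The Haar wavelet, the single decomposition level, and the function names `haar_dwt2`, `haar_idwt2`, and `remove_high_frequency` are illustrative assumptions; this record does not state which wavelet, decomposition level, or sub-band combination the thesis uses, and sub-band naming conventions vary between libraries.

```python
# Illustrative sketch only: the wavelet (Haar), the single decomposition level,
# and all function names are assumptions, not details taken from the thesis.


def haar_dwt2(image):
    """Single-level 2D Haar DWT of a 2D list with even height and width.

    Returns four sub-bands (LL, LH, HL, HH), each half the size of the input.
    LL is the low-frequency approximation; the other three hold detail.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image height and width must both be even")
    ll, lh, hl, hh = [], [], [], []
    for r in range(0, rows, 2):
        ll_row, lh_row, hl_row, hh_row = [], [], [], []
        for c in range(0, cols, 2):
            tl, tr = image[r][c], image[r][c + 1]          # top-left, top-right
            bl, br = image[r + 1][c], image[r + 1][c + 1]  # bottom-left, bottom-right
            ll_row.append((tl + tr + bl + br) / 2)  # scaled sum of the 2x2 block
            lh_row.append((tl - tr + bl - br) / 2)  # difference across columns
            hl_row.append((tl + tr - bl - br) / 2)  # difference across rows
            hh_row.append((tl - tr - bl + br) / 2)  # diagonal difference
        ll.append(ll_row)
        lh.append(lh_row)
        hl.append(hl_row)
        hh.append(hh_row)
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: rebuild the full-size image from four sub-bands."""
    rows, cols = len(ll), len(ll[0])
    image = [[0.0] * (2 * cols) for _ in range(2 * rows)]
    for r in range(rows):
        for c in range(cols):
            a, h, v, d = ll[r][c], lh[r][c], hl[r][c], hh[r][c]
            image[2 * r][2 * c] = (a + h + v + d) / 2
            image[2 * r][2 * c + 1] = (a - h + v - d) / 2
            image[2 * r + 1][2 * c] = (a + h - v - d) / 2
            image[2 * r + 1][2 * c + 1] = (a - h - v + d) / 2
    return image


def remove_high_frequency(image):
    """Keep only the LL sub-band: zero all detail sub-bands, then invert."""
    ll, _, _, _ = haar_dwt2(image)
    zeros = [[0.0] * len(row) for row in ll]
    return haar_idwt2(ll, zeros, zeros, zeros)
```

With only the LL band kept, each 2x2 block of the reconstructed image is replaced by its mean, which is the simplest form of low-pass filtering. A real pipeline would more likely rely on an established wavelet library and tune which sub-bands to keep.
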
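Files in this document:
| File | Size | Format |
|---|---|---|
| ntu-112-2.pdf (not authorized for public access) | 5.92 MB | Adobe PDF |

Unless a specific copyright license is indicated, all documents in this system are protected by copyright, with all rights reserved.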
