Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/100134

| Title: | Deep Learning Framework Using Multimodal Data to Early Predict Neurological Outcomes in OHCA Patients (使用多模態資料的深度學習架構以早期預測院外心跳停止患者的神經學預後) |
| Author: | 蔡欣翰 Hsin-Han Tsai |
| Advisor: | 周承復 Cheng-Fu Chou |
| Co-advisor: | 王偉仲 Weichung Wang |
| Keywords: | Head computed tomography, Gray-white matter ratio, Hypoxic-ischemic encephalopathy, Out-of-hospital cardiac arrest, Return of spontaneous circulation, Clinical decision making, Multimodal model |
| Publication year: | 2025 |
| Degree: | Doctoral |
| Abstract: | Out-of-hospital cardiac arrest (OHCA) patients often have poor neurological outcomes at discharge even after active resuscitation, primarily because of hypoxic–ischemic encephalopathy. Early identification of patients who are unlikely to regain consciousness is crucial for clinical decision-making. However, most existing studies rely solely on electronic medical record (EMR) data, such as demographic information, prehospital variables, and laboratory results, while imaging data are rarely incorporated.
In particular, studies involving multimodal fusion remain scarce. This study aims to develop a multimodal predictive framework that integrates EMR and head CT (HCT) data using joint fusion and late fusion strategies to improve predictive accuracy.
This retrospective study included OHCA patients who underwent brain CT within 12 hours after return of spontaneous circulation (ROSC). The primary outcome was neurological status at hospital discharge, with a favorable outcome defined as a Cerebral Performance Category (CPC) score of 1 or 2 and a poor outcome as a CPC score of 3, 4, or 5. We improved a previously proposed method for calculating the gray–white matter ratio (GWR) by introducing a two-stage image registration process and applying contrast-limited adaptive histogram equalization (CLAHE) followed by K-means clustering to better differentiate gray and white matter. For HCT modeling, we adopted two deep learning architectures: ResNet50, based on convolutional neural networks, and Vision Transformer (ViT), based on the Transformer architecture. Additionally, GWR was incorporated into the EMR features to build early fusion models, and we evaluated a Transformer-based tabular foundation model, TabPFN. Finally, we developed multimodal models by combining the EMR and HCT pipelines using both joint and late fusion strategies.
The study included data from three hospitals: National Taiwan University Hospital (NTUH, N = 834), NTUH Hsinchu Branch (NTUH-HC, N = 130), and NTUH Yunlin Branch (NTUH-YL, N = 178). Model training and validation were performed on the NTUH cohort, while NTUH-YL and NTUH-HC served as external test sets. The improved GWR method outperformed the previous version on all datasets (AUC: NTUH 0.79 vs. 0.77; NTUH-YL 0.82 vs. 0.78; NTUH-HC 0.88 vs. 0.82).
Among unimodal models, ViT and TabPFN achieved the best performance when compared with traditional machine learning methods. ViT achieved AUCs of 0.88, 0.83, and 0.89 in the NTUH, NTUH-YL, and NTUH-HC cohorts, respectively, while TabPFN achieved 0.86, 0.79, and 0.81. Incorporating GWR into the EMR features, which can be regarded as an early fusion multimodal model (Early-GWR2stage-TabPFN), further improved performance (AUCs: 0.89, 0.84, and 0.88).
The best-performing model overall was Late-ViT-TabPFN, which achieved AUCs of 0.92 (95% CI: 0.86–0.97), 0.87 (95% CI: 0.80–0.94), and 0.95 (95% CI: 0.90–0.98) in the NTUH, NTUH-YL, and NTUH-HC cohorts, respectively. With specificity set at 95%, this model achieved sensitivities of 0.56 and 0.59 in NTUH-YL and NTUH-HC, respectively. Visualizing the image input of the multimodal model with interpretability methods (such as SHAP and Integrated Gradients) showed that the model focused on brain regions used in GWR calculation, such as the putamen (PU) and the posterior limb of the internal capsule (PIC), as well as the occipital lobe and cerebral cortex. This is consistent with prior findings and supports the interpretability and credibility of the model.
The ViT-based HCT model demonstrated strong performance in predicting neurological outcomes, highlighting the value of imaging data for early prognosis before targeted temperature management (TTM). Overall, multimodal models outperformed both unimodal and early fusion models in all evaluations, underscoring the importance and practical potential of multimodal integration in clinical outcome prediction and marking it as a promising direction for future research. |
| URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/100134 |
| DOI: | 10.6342/NTU202502309 |
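| Full-text authorization: | Not authorized |
| Electronic full-text release date: | N/A |
| Appears in Collections: | Department of Computer Science and Information Engineering |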
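
The abstract reports a late fusion model (Late-ViT-TabPFN) and its sensitivity at a fixed 95% specificity. The record does not say how the two model outputs were combined or how the operating threshold was chosen, so the following is only a minimal sketch under stated assumptions: late fusion is taken to be a weighted average of the per-patient probabilities from an image model and a tabular model, and the threshold is the lowest score cut-off whose specificity meets the target. The function names `late_fusion` and `sensitivity_at_specificity`, the weight `w_image`, and the toy numbers are hypothetical and are not taken from the thesis.

```python
from typing import List, Sequence


def late_fusion(p_image: Sequence[float],
                p_tabular: Sequence[float],
                w_image: float = 0.5) -> List[float]:
    """Combine two per-patient probabilities of poor outcome.

    Weighted averaging is an assumption made for illustration only;
    the thesis may combine the model outputs differently.
    """
    if len(p_image) != len(p_tabular):
        raise ValueError("both models must score the same patients")
    if not 0.0 <= w_image <= 1.0:
        raise ValueError("w_image must lie in [0, 1]")
    return [w_image * a + (1.0 - w_image) * b
            for a, b in zip(p_image, p_tabular)]


def sensitivity_at_specificity(y_true: Sequence[int],
                               scores: Sequence[float],
                               target_specificity: float = 0.95) -> float:
    """Sensitivity at the lowest threshold that reaches the target specificity.

    y_true uses 1 for a poor outcome (the positive class) and 0 for a
    good outcome. A patient is predicted positive when score >= threshold.
    """
    if not 0.0 <= target_specificity <= 1.0:
        raise ValueError("target_specificity must lie in [0, 1]")
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    if not negatives or not positives:
        raise ValueError("need at least one patient in each class")

    # Candidate thresholds in ascending order. Specificity can only rise
    # as the threshold rises, so the first hit keeps sensitivity maximal.
    for threshold in sorted(set(scores)) + [float("inf")]:
        specificity = sum(s < threshold for s in negatives) / len(negatives)
        if specificity >= target_specificity:
            return sum(s >= threshold for s in positives) / len(positives)
    raise AssertionError("unreachable: an infinite threshold gives specificity 1")


if __name__ == "__main__":
    # Toy scores for six hypothetical patients (not data from the study).
    image_scores = [0.9, 0.8, 0.3, 0.2, 0.6, 0.1]
    tabular_scores = [0.7, 0.4, 0.5, 0.2, 0.8, 0.3]
    labels = [1, 1, 1, 0, 0, 0]
    fused = late_fusion(image_scores, tabular_scores)
    print(sensitivity_at_specificity(labels, fused, 0.95))
```

Fixing specificity first reflects the clinical setting described in the abstract, where falsely predicting a poor outcome is the costlier error. Sensitivity is then read off at that operating point.
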
Files in this item:
| File | Size | Format | |
|---|---|---|---|
| ntu-113-2.pdf (not authorized for public access) | 8.41 MB | Adobe PDF | |
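Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.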
