Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/99927

| Title: | 開發一創新多模態深度學習方法提升卵巢腫瘤診斷精準度 / Development of A Novel Multi-Modal Deep Learning Approach to Improve Diagnostic Precision in Ovarian Cancer |
| Author: | 邱柏鈞 Po-Chun Chiu |
| Advisor: | 盧子彬 Tzu-Pin Lu |
| Keywords: | ovarian tumors, multimodal deep learning, ultrasound imaging, artificial intelligence diagnosis |
| Publication Year: | 2025 |
| Degree: | Master's |
| Abstract: | Background and Objectives: In Taiwan, ovarian cancer is one of the leading causes of death from gynecological malignancies among women, making early detection essential for improving patient prognosis. Because treatment strategies for benign and malignant ovarian tumors differ substantially, accurate preoperative diagnosis directly shapes clinical decision-making. Traditional ultrasound diagnosis is highly operator-dependent, introducing subjectivity and variability. To improve diagnostic precision in ovarian tumor classification, this study developed a multimodal deep learning model that combines ultrasound images with their corresponding clinical text reports.
Methods: This retrospective single-center study analyzed 1,342 ultrasound images, together with the matching ultrasound text reports, from 1,062 patients who underwent surgery for ovarian tumors in the National Taiwan University Hospital system between 2011 and 2021. Patients were classified into benign (including endometrioma, n=612) and malignant (including borderline, n=450) groups based on pathological confirmation. A multimodal deep learning architecture was developed, incorporating DenseNet-121 and a Swin Transformer for image feature extraction and Bio-Clinical BERT for processing the clinical text reports. The dataset was split using subject-level stratification with 5-fold cross-validation, and model performance was evaluated on a 15% independent test set.
Results: At the subject level, the multimodal model achieved 81.68% accuracy, 79.38% sensitivity, 83.81% specificity, and an area under the curve (AUC) of 0.89. Image-level classification performed even better, with 85.15% accuracy, 86.73% sensitivity, 83.65% specificity, and an AUC of 0.91. Integrating clinical text information significantly improved diagnostic performance over image-only models. Backward selection analysis showed that both the uterine findings and the ovarian tumor descriptions in the text reports contributed synergistically to the final diagnosis.
Conclusions: This study developed a multimodal deep learning model that integrates ultrasound images with clinical text reports, achieving diagnostic performance superior to traditional unimodal approaches and existing clinical guidelines. The model shows promise as an AI-assisted diagnostic tool for ovarian tumor classification, offering clinicians a valuable adjunct for improving preoperative diagnostic accuracy and patient care quality. |
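The subject-level stratified split described in the Methods can be sketched as follows. This is a minimal illustration (not the authors' code), assuming each patient carries a single benign/malignant label derived from pathology; the key property is that all images from one patient land in the same fold, so the model is never evaluated on images from a patient it saw during training:

```python
# Hypothetical sketch of a subject-level stratified K-fold split.
import random
from collections import defaultdict

def subject_level_folds(image_labels, n_folds=5, seed=42):
    """image_labels: list of (image_id, subject_id, label) tuples.
    Returns {fold_index: set of subject_ids}, stratified by each
    subject's majority label, with every subject in exactly one fold."""
    by_subject = defaultdict(list)
    for _, subject, label in image_labels:
        by_subject[subject].append(label)
    # Collapse to one label per subject (majority vote over its images).
    subject_label = {s: max(set(ls), key=ls.count) for s, ls in by_subject.items()}
    rng = random.Random(seed)
    folds = defaultdict(set)
    for label in sorted(set(subject_label.values())):
        subjects = sorted(s for s, l in subject_label.items() if l == label)
        rng.shuffle(subjects)
        # Deal subjects of this label round-robin across folds.
        for i, s in enumerate(subjects):
            folds[i % n_folds].add(s)
    return dict(folds)

# Example: 10 synthetic patients with two images each.
images = [(f"img{i}", f"patient{i // 2}", (i // 2) % 2) for i in range(20)]
folds = subject_level_folds(images)
# Every patient appears in exactly one fold.
assert sum(len(f) for f in folds.values()) == len({s for _, s, _ in images})
```

A held-out independent test set (15% in this study) would be carved out the same way, at the subject level, before the cross-validation folds are formed.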
| URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/99927 |
| DOI: | 10.6342/NTU202502500 |
| Full-Text License: | Access granted (restricted to on-campus access) |
| Full-Text Release Date: | 2028-07-24 |
| Appears in Collections: | Institute of Health Data Analytics and Statistics |
Files in This Item:
| File | Size | Format | |
|---|---|---|---|
| ntu-113-2.pdf (restricted access) | 1.44 MB | Adobe PDF | View/Open |
Items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.
