A Hybrid Vision Model Under the MIL Framework for BCLC Staging of Hepatocellular Carcinoma Using 3D CT

張舜程; Shun-Cheng Chang

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/96139

Title:	基於多實例學習框架下之混合視覺模型方法實現肝細胞癌 BCLC 分期系統 (A Hybrid Vision Model Under the MIL Framework for BCLC Staging of Hepatocellular Carcinoma Using 3D CT)
Author:	張舜程 Shun-Cheng Chang
Advisor:	林澤 Che Lin
Keywords:	Medical Imaging, Hepatocellular Carcinoma, BCLC Staging, Convolutional Neural Network, Vision Transformer, Multiple Instance Learning
Publication Year:	2024
Degree:	Master's
Abstract:	Deep learning has transformed medical imaging by providing advanced methods for accurate diagnosis and treatment planning. The Barcelona Clinic Liver Cancer (BCLC) staging system is central to staging hepatocellular carcinoma (HCC), a cancer with high mortality. An automated BCLC staging system could make diagnosis and treatment planning considerably more efficient. In this study, we observe that BCLC staging, which is directly related to the size and number of liver tumors, aligns well with the principles of the Multiple Instance Learning (MIL) framework. To use this framework effectively, we propose a new preprocessing technique, Masked Cropping and Padding (MCP), which addresses the variability in liver volumes and ensures consistent input sizes. The technique preserves the structural integrity of the liver, facilitating more effective learning. We further introduce ReViT, a novel hybrid model that combines the local feature extraction of Convolutional Neural Networks (CNNs) with the global context modeling of Vision Transformers (ViTs). The model leverages the strengths of both architectures within the MIL framework, enabling robust and accurate BCLC staging. By employing Top-K pooling, the model focuses on the most informative instances within each bag, which allows the trade-off between performance and interpretability to be explored. Our approach demonstrates superior performance and robustness in BCLC staging compared with traditional methods. It not only improves diagnostic accuracy but also offers greater clinical interpretability, with the potential to improve patient outcomes.
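Illustrative sketch (not taken from the thesis): the abstract names two mechanisms, Masked Cropping and Padding and Top-K pooling over MIL instances, without giving their details. The NumPy code below shows one plausible reading of each. The function names `masked_crop_and_pad` and `top_k_pool`, the zero fill value, the centered placement, and the per-class averaging of the k highest instance scores are all assumptions made for illustration; what counts as an instance (for example a CT slice or a sub-volume) is also not stated in the abstract.

```python
import numpy as np


def masked_crop_and_pad(volume, mask, out_shape):
    """Hypothetical MCP sketch: crop to the mask's bounding box, zero the
    voxels outside the mask, then center the result in a fixed-size array.
    Axes longer than the target are center-cropped (an assumption)."""
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():
        raise ValueError("mask is empty")
    coords = np.argwhere(mask)                    # (n_voxels, ndim)
    lo = coords.min(axis=0)
    hi = coords.max(axis=0) + 1
    box = tuple(slice(a, b) for a, b in zip(lo, hi))
    cropped = np.where(mask[box], volume[box], 0)

    out = np.zeros(out_shape, dtype=volume.dtype)
    src, dst = [], []
    for size, target in zip(cropped.shape, out_shape):
        n = min(size, target)                     # extent copied on this axis
        s0 = (size - n) // 2                      # start offset in the crop
        d0 = (target - n) // 2                    # start offset in the output
        src.append(slice(s0, s0 + n))
        dst.append(slice(d0, d0 + n))
    out[tuple(dst)] = cropped[tuple(src)]
    return out


def top_k_pool(instance_scores, k):
    """Hypothetical Top-K MIL pooling: for each class, average the k highest
    instance scores in the bag.

    instance_scores has shape (n_instances, n_classes). Returns the bag-level
    scores, shape (n_classes,), and the indices of the selected instances,
    shape (k, n_classes), which can be inspected for interpretability."""
    scores = np.asarray(instance_scores, dtype=float)
    k = min(k, scores.shape[0])
    order = np.argsort(-scores, axis=0)[:k]       # top-k indices per class
    top = np.take_along_axis(scores, order, axis=0)
    return top.mean(axis=0), order
```

In a sketch like this, a small k makes the bag prediction depend on only a few instances, which are easy to inspect, while a larger k averages over more evidence; that is one way to read the performance versus interpretability trade-off mentioned in the abstract.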
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/96139
DOI:	10.6342/NTU202404487
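Full-Text License:	Authorized (access restricted to campus)
Full-Text Release Date:	2029-10-23
Appears in Collections:	Graduate Institute of Communication Engineering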

Files in This Item:

File	Size	Format
ntu-113-1.pdf (not authorized for public access)	25.05 MB	Adobe PDF

Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are otherwise indicated.