Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/95680
Title: | 以空間可變形卷積核為基礎之 U-Net 模型及其於醫療影像分割之應用 (SDUNet: A Spatial Deformable Kernel-based U-Net for Medical Image Segmentation) |
Authors: | 簡弘宇 Hung-Yu CHIEN |
Advisor: | 藍俊宏 Jakey Blue |
Keyword: | Medical Image, Semantic Segmentation, U-Net, Deformable Convolution Kernel, DropBlock, Knowledge Distillation, Loss Function |
Publication Year: | 2024 |
Degree: | Master's |
Abstract: | Image segmentation has long been an important task in computer vision. With advances in hardware and in model architectures, deep learning models are increasingly used to assist or replace traditional segmentation algorithms, improving both efficiency and accuracy. In medical imaging, annotations are typically produced manually by physicians, which not only consumes a great deal of valuable professional time and effort but also introduces subjective variability between annotators. Developing deep learning methods that interpret image content and provide baseline segmentation results to support physicians' judgment has therefore become an important direction for applying AI to medicine.

The efficiency, performance, and generalization capability of convolutional neural networks (CNNs) have made them the mainstream approach for image tasks, but their relatively small, fixed square receptive fields often fail to capture the geometric deformations of target objects. Common remedies fall into data augmentation and case-specific algorithms; the former adds time cost, while the latter lacks generality. Deformable convolution networks (DCN), introduced in 2017, address this by learning per-position sampling offsets. To strengthen the local spatial characteristics of deformable convolution, this study proposes a spatial extension algorithm and architecture, the Spatial Deformable Kernel-based U-Net (SDUNet). By learning and refining the offsets to construct pseudo-labels for them, the model is guided toward better sampling points; the learned offsets are also visualized to analyze and compare SDUNet's effectiveness.
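To make the mechanism concrete, here is a minimal sketch of the deformable convolution idea the abstract builds on; it assumes PyTorch with torchvision's DeformConv2d and is illustrative only, not the thesis implementation. A small convolution predicts per-position sampling offsets that bend the fixed 3x3 grid of a standard convolution toward the object's geometry.

```python
# Minimal deformable-convolution sketch (assumed PyTorch/torchvision,
# not the thesis code): an offset head predicts (dy, dx) for every
# kernel sampling point, and DeformConv2d samples at the shifted locations.
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class DeformableBlock(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, k: int = 3):
        super().__init__()
        # 2 offset values (dy, dx) per kernel sampling point -> 2*k*k channels
        self.offset_head = nn.Conv2d(in_ch, 2 * k * k, kernel_size=k, padding=k // 2)
        self.deform_conv = DeformConv2d(in_ch, out_ch, kernel_size=k, padding=k // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        offsets = self.offset_head(x)        # shape (N, 2*k*k, H, W)
        return self.deform_conv(x, offsets)  # convolve at deformed positions

x = torch.randn(1, 16, 64, 64)
print(DeformableBlock(16, 32)(x).shape)  # torch.Size([1, 32, 64, 64])
```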
This research focuses on image segmentation tasks: it improves deformable convolutional networks, combines them with a basic U-Net, and adds offset distillation to the loss function so that the model learns offsets effectively and predicts masks more accurately. SDUNet makes three improvements. First, it refines the deformable convolution kernel so that, at low computational cost, offsets are regularized and local spatial information is enriched. Second, it adds DropBlock and pointwise convolutions to the U-shaped architecture of U-Net, avoiding over-reliance on a subset of features and increasing channel information. Third, it introduces offset distillation, so that the known offsets learn from pseudo-labels propagated through the loss function, dynamically and continuously focusing the model on effective sampling points; a sketch of such a distillation term follows below.
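The abstract does not give the exact form of the offset-distillation term. A plausible minimal sketch, assuming it combines a standard segmentation loss with an L2 pull of the learned offsets toward their pseudo-labels (the weight `lam` and the L2 form are assumptions, not the thesis's formulation):

```python
# Hedged sketch of an offset-distillation loss: segmentation term plus a
# term pulling learned offsets toward pseudo-label offsets. The MSE form
# and weighting `lam` are assumptions; the thesis may differ.
import torch
import torch.nn.functional as F

def sdunet_loss(pred_mask, true_mask, pred_offsets, pseudo_offsets, lam=0.1):
    # segmentation term: BCE on the predicted mask logits
    seg = F.binary_cross_entropy_with_logits(pred_mask, true_mask)
    # distillation term: offsets learn from (detached) pseudo-labels
    distill = F.mse_loss(pred_offsets, pseudo_offsets.detach())
    return seg + lam * distill
```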
To validate SDUNet, this study uses visceral fat images from a domestic hospital and a public retinal vessel dataset, evaluating the predictive performance of the Spatial Deformable Kernel-based U-Net on both datasets, comparing the distilled pseudo-labels with the original model, and discussing the meaning of the convolution kernel offsets. Ablation experiments confirm that SDUNet, built on a basic deformable convolutional network structure, improves performance: on the retinal vessel dataset, the Dice coefficient rises from 86.82% to 88.83% and the IoU coefficient from 78.59% to 81.24%; on the visceral fat data, the Dice coefficient rises from 93.27% to 94.02% and the IoU coefficient from 89.07% to 89.85%. |
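For reference, the Dice and IoU coefficients quoted above have standard definitions; here is a minimal generic implementation over binary masks (not taken from the thesis code):

```python
# Standard Dice and IoU over binary masks, matching the metrics reported
# in the abstract. `eps` guards against division by zero on empty masks.
import torch

def dice_iou(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-7):
    pred, target = pred.bool(), target.bool()
    inter = (pred & target).sum().float()
    dice = (2 * inter + eps) / (pred.sum() + target.sum() + eps)
    iou = (inter + eps) / ((pred | target).sum().float() + eps)
    return dice.item(), iou.item()
```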
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/95680 |
DOI: | 10.6342/NTU202402447 |
Fulltext Rights: | Authorized for public access (limited to campus network) |
Appears in Collections: | Master's Program in Statistics |
Files in This Item:
File | Size | Format
---|---|---
ntu-112-2.pdf (access limited to NTU IP range) | 2.23 MB | Adobe PDF