Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/84505
Title: | Real-time ultrasound image condition assessment of median nerve in carpal tunnel using improved object detection model with attention mechanism |
Author: | Yi-Ting Hung 洪怡庭 |
Advisor: | 郭柏齡 (Po-Ling Kuo) |
Keywords: | carpal tunnel, automatic image condition assessment, deep learning, object detection, median nerve, ultrasonography |
Year of publication: | 2022 |
Degree: | Master's |
Abstract: | Carpal tunnel syndrome (CTS) is a common entrapment neuropathy caused by compression of the median nerve by adjacent tissues as it travels through the carpal tunnel. Ultrasonography is non-invasive and well accepted by patients. A growing body of evidence indicates that an entrapped median nerve exhibits abnormal morphological dynamics on dynamic ultrasonography, which makes dynamic ultrasonography a promising diagnostic tool for CTS.
Our laboratory has previously demonstrated real-time segmentation of the median nerve in dynamic ultrasonography, using deep-learning models to automatically delineate the nerve contour and analyze its morphological dynamics as the images are acquired. To achieve high segmentation accuracy and ensure the reliability of the subsequent analysis, however, substantial manual effort was needed to screen out acquired images whose condition was unfavorable for nerve segmentation or subsequent analysis, such as images with a blurred nerve appearance or images taken at an improper anatomical plane. Moreover, if the acquired images turn out to be unfavorable for analysis, it is usually difficult to ask the subject to return to the clinic for reacquisition. One appealing solution is a platform that automatically and objectively determines, at the moment an image is acquired, whether it is favorable for subsequent analysis. In the present work, we propose deep learning-based object detection frameworks that automatically evaluate the image condition as soon as the image is collected. Because well-defined nerve boundaries are essential for correct segmentation, and because the morphological dynamics should be evaluated at the same anatomical plane throughout the image sequence, we defined a favorable image condition as an image with a clear nerve appearance that is taken at the plane transecting the proximal inlet of the carpal tunnel, identified by the coexistence of the scaphoid and the pisiform in the image. We quantified nerve clarity from the gray-level variation across the nerve boundary and converted the resulting score into a binary category of high or low clarity using a threshold. To reduce the manual effort of annotating the bony landmarks, we trained Unbiased Teacher, a semi-supervised learning framework, on a small amount of manually labeled data to automatically recognize the two landmarks in the whole dataset.
The model achieved a precision of 0.941, a recall of 0.971, and an F1-score of 0.956 on the test dataset. Its predictions were then used as the ground truth for the presence of the bony landmarks. The data labeled for the favorability of the image condition were then used to train and evaluate end-to-end automatic image condition assessment frameworks built on lightweight object detection models: EfficientDet-D0, YOLOv5n, and three architectures modified from YOLOv5n by introducing attention modules into the spatial pyramid pooling layer to enhance feature extraction, named YOLOv5n-SPPSE, YOLOv5n-SPPCBAM, and YOLOv5n-SPPmultiCBAM. Our results show that adding the attention mechanisms improved the detection accuracy while maintaining real-time prediction. The average F1-scores were 0.823, 0.868, 0.876, 0.873, and 0.883 for EfficientDet-D0, YOLOv5n, YOLOv5n-SPPSE, YOLOv5n-SPPCBAM, and YOLOv5n-SPPmultiCBAM, respectively, and the corresponding inference speeds were 23.02, 44.78, 44.40, 42.96, and 40.71 frames per second (FPS). Compared with the YOLACT segmentation model trained on the dataset without screening for image condition, the model trained on the screened dataset improved by 6% in average IoU and by 4.5% in Dice coefficient. Our work provides end-to-end frameworks that automatically evaluate, in real time, whether the image condition is favorable for nerve segmentation and subsequent data analysis, and they have strong potential to serve as an auxiliary tool for the diagnosis of CTS using dynamic ultrasonography. |
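The favorability rule described in the abstract (a clear nerve appearance together with the coexistence of the scaphoid and the pisiform) can be sketched in Python as follows. This is a minimal illustration and not the thesis implementation: the clarity measure used here (mean absolute gray-level difference between paired samples just inside and just outside the nerve boundary), the threshold value, and all function and constant names are assumptions introduced for this sketch, since the abstract does not specify them.

```python
from statistics import mean
from typing import Iterable, Sequence

# Hypothetical threshold: the abstract does not report the actual value.
CLARITY_THRESHOLD = 20.0

# Landmarks whose coexistence marks the proximal inlet of the carpal tunnel.
REQUIRED_LANDMARKS = frozenset({"scaphoid", "pisiform"})


def clarity_score(inside: Sequence[float], outside: Sequence[float]) -> float:
    """Assumed clarity measure: mean absolute gray-level difference between
    paired samples taken just inside and just outside the nerve boundary."""
    if len(inside) != len(outside) or not inside:
        raise ValueError("need equally sized, non-empty sample lists")
    return mean(abs(a - b) for a, b in zip(inside, outside))


def is_high_clarity(score: float, threshold: float = CLARITY_THRESHOLD) -> bool:
    """Convert the continuous clarity score into a high/low binary category."""
    return score >= threshold


def is_favorable(
    score: float,
    detected_landmarks: Iterable[str],
    threshold: float = CLARITY_THRESHOLD,
) -> bool:
    """An image is favorable only if the nerve is clear AND both bony
    landmarks were detected in the same frame."""
    has_both_landmarks = REQUIRED_LANDMARKS <= set(detected_landmarks)
    return is_high_clarity(score, threshold) and has_both_landmarks
```

In the thesis these two conditions are predicted end to end by the object detection models; the explicit rule above only illustrates how the labels are defined.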
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/84505 |
DOI: | 10.6342/NTU202203652 |
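Full-text authorization: | Authorized (access limited to campus) |
Full-text public release date: | 2024-09-30 |
Appears in collections: | Graduate Institute of Biomedical Electronics and Bioinformatics |
Files in this item: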
File | Size | Format | |
---|---|---|---|
U0001-2009202215132400.pdf Access restricted to NTU campus IP addresses (off campus, please use the VPN remote-access service) | 5.89 MB | Adobe PDF | View/Open |
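All items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.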