Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/20410
Title: | Configurable Polynomial-Time Semantic Segmentation for Endoscope Video via Superpixel Tracking |
Author: | Wei-Yao Hong (洪偉堯) |
Advisor: | Chi-Sheng Shih (施吉昇) |
Keywords: | Semantic Segmentation, Object Tracking, Superpixel, Polynomial Time Complexity, Computer-Assisted Endoscopy Surgery |
Publication Year: | 2020 |
Degree: | Master |
Abstract: | Computer-aided endoscopic surgery has become a well-accepted operating practice. To further assist surgical teams, many applications build Augmented Reality (AR) or Mixed Reality (MR) glasses that help surgeons recognize the class, shape, or position of organs during operations. However, it is not practical for surgeons to carry an appliance-size computer while operating with AR or MR glasses, so it is desirable to perform the computation on embedded devices instead of high-performance computers. In this thesis, we aim to recognize organs without constantly applying computationally expensive semantic segmentation algorithms. Current research on semantic segmentation mostly relies on neural network models, which are computationally expensive and depend on GPUs; running such a model on every frame is not feasible on embedded devices with limited GPU resources. Therefore, we propose a semantic segmentation tracking method that has polynomial-time complexity and can run on the CPU. It takes a small number of GPU-computed neural-network predictions as keyframes and tracks organ boundaries and positions to generate the semantic segmentation after the organs or the endoscope move. Since tracking semantic segmentation is a pixel-wise accurate task, we preprocess the images with superpixels to support the tracking. The experimental results show that the FLOPs (floating-point operations) required for superpixel tracking per frame are less than 1/25,900 of those of the DeepLabV3+ model, while achieving the same segmentation accuracy. |
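The abstract describes propagating a keyframe's pixel-wise labels through superpixels. As a minimal illustrative sketch (not the thesis's actual algorithm), one common way to combine a superpixel map with a dense segmentation is to assign each superpixel the majority class it covers in the keyframe prediction; the function name `propagate_labels` and the toy arrays below are hypothetical:

```python
import numpy as np

def propagate_labels(superpixels, keyframe_seg):
    """Assign every superpixel the majority semantic class it covers
    in the keyframe segmentation (illustrative sketch only).

    superpixels  -- 2-D integer array of superpixel IDs
    keyframe_seg -- 2-D integer array of per-pixel class labels
    """
    out = np.zeros_like(keyframe_seg)
    for sp_id in np.unique(superpixels):
        mask = superpixels == sp_id
        # Count each class inside this superpixel and keep the most frequent.
        classes, counts = np.unique(keyframe_seg[mask], return_counts=True)
        out[mask] = classes[np.argmax(counts)]
    return out

# Toy example: two superpixels, one noisy pixel in superpixel 0.
sp = np.array([[0, 0, 1, 1],
               [0, 0, 1, 1]])
seg = np.array([[2, 2, 3, 3],
                [2, 3, 3, 3]])
print(propagate_labels(sp, seg))  # → [[2 2 3 3], [2 2 3 3]]
```

Snapping labels to superpixel boundaries this way keeps the result pixel-wise aligned with image edges while the per-superpixel bookkeeping stays cheap enough for a CPU.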
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/20410 |
DOI: | 10.6342/NTU202004203 |
Full-Text Access: | Not authorized |
Appears in Collections: | Department of Computer Science and Information Engineering |
Files in This Item:
File | Size | Format | |
---|---|---|---|
U0001-0209202001393300.pdf (currently not authorized for public access) | 3.06 MB | Adobe PDF |
All items in the system are protected by copyright, with all rights reserved, unless otherwise indicated.