Please use this identifier to cite or link to this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/84485
Title: | 基於語義分割模型之台灣街景解析優化 The optimization of semantic segmentation model for Taiwan street scene |
Authors: | Yung-Chieh Chang 張詠絜 |
Advisor: | 黃乾綱(Chien-Kang Huang) |
Keyword: | Deep Learning, Transfer Learning, Domain Adaptation, Semantic Segmentation, Attention Mechanism |
Publication Year: | 2022 |
Degree: | Master's |
Abstract: | In recent years, artificial intelligence and deep learning have flourished and have been widely applied to image segmentation, with uses ranging from autonomous driving to medical diagnosis. Image segmentation can be divided into two types, instance segmentation and semantic segmentation; the combination of these two basic tasks is called panoptic segmentation. This thesis focuses on semantic segmentation and its application to Taiwan street scene recognition. The purpose of the thesis is to address common problems of semantic segmentation models. Semantic segmentation models generally rely on a large amount of labeled data during training. Most labeled open-source datasets are collected mainly from street scenes in Western countries, and there is currently no open-source dataset dedicated to Taiwan street scenes. Beyond data collection, many models extract insufficient features, and the information lost with the ignored features can lead to misclassified pixels and less accurate prediction of details. To address these problems, we propose an improved domain adaptation model. The domain adaptation network uses transfer learning so that it can be trained on unlabeled data. We collected unlabeled daytime and nighttime street scenes in Taiwan ourselves, matched them, and added them to the training dataset to improve predictions on Taiwan street scenes. Finally, we introduce additional modules into the semantic segmentation network in a way that does not disrupt the original architecture: an attention module to reduce class confusion and a residual refinement module to strengthen the prediction of details. According to the experimental results, compared with DANNet, the proposed model improves pixel accuracy by 2.56% and mean intersection over union by 2.31%. |
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/84485 |
DOI: | 10.6342/NTU202203866 |
Fulltext Rights: | Authorization granted (access limited to campus) |
metadata.dc.date.embargo-lift: | 2022-09-29 |
Appears in Collections: | Department of Engineering Science and Ocean Engineering |
Files in This Item:
File | Size | Format | |
---|---|---|---|
U0001-2309202200243600.pdf (access limited to NTU IP range) | 3.29 MB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
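
The abstract reports results in terms of pixel accuracy and mean intersection over union. The thesis's own evaluation code is not part of this record, so the following is only a minimal sketch of how these two metrics are commonly computed from a confusion matrix. The function names and the toy labels are hypothetical and chosen purely for illustration.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are true classes, columns are predicted classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def pixel_accuracy(cm):
    """Fraction of all pixels whose predicted class matches the true class."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def mean_iou(cm):
    """Average over classes of TP / (TP + FP + FN); absent classes are skipped."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                       # true class c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, true class differs
        union = tp + fp + fn
        if union > 0:
            ious.append(tp / union)
    return sum(ious) / len(ious)


# Toy example (hypothetical data): six pixels, three classes, flattened to 1-D.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
print(f"pixel accuracy: {pixel_accuracy(cm):.4f}")
print(f"mean IoU:       {mean_iou(cm):.4f}")
```

In this toy case four of six pixels are correct, while the per-class IoU values are 1/3, 2/3, and 1/2. Pixel accuracy weights every pixel equally, whereas mean IoU weights every class equally, which is why the two metrics can move by different amounts for the same model change.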