Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/7589
Title: | Deep Learning Based Algorithms for Object Tracking and Dolphin Detection and Identification (基於深度學習的物件追蹤算法與海豚偵測與識別) |
Author: | Hung-Wei Hsu (許宏瑋) |
Advisor: | 丁建均 |
Keywords: | object tracking (visual tracking), convolutional recurrent neural network, object detection, image classification, fine-grained image classification, DenseNet, Taiwanese Humpback Dolphin (媽祖魚), saliency |
Year of publication: | 2018 |
Degree: | Master's |
Abstract: | This thesis consists of two parts. The first concerns class-agnostic visual tracking: two algorithms are proposed that exploit the rich representations deep learning provides for temporal and image data to address open problems in this area. The second applies computer vision to environmental conservation, specifically the detection and identification of dolphins in image data.
For tracking, the thesis proposes FasterMDNet and RDisp. FasterMDNet replaces the computationally expensive online training of MDNet with an RNN-based model-update strategy, which is trained by repeated online training and back-propagation through time (BPTT). FasterMDNet runs roughly ten times faster than MDNet with little loss of accuracy. RDisp builds a tracker by adding ConvRNN cells to a pretrained object detection model. Because BPTT has shortcomings when training RDisp, a two-stage clip training scheme is proposed to replace it. RDisp runs at about 25 frames per second on a GPU and remains robust under many common image variations.
For dolphin detection and identification, the backbone algorithms are Faster R-CNN and DenseNet. When identifying individual dolphins by name, the plain sea-surface background keeps the trained model from focusing on the fine details of the dolphin itself. The thesis proposes an ensemble of deep-learning-based and rule-based saliency detection algorithms, combined with a soft Gaussian threshold, to produce a dolphin mask that removes sea-surface pixels and their influence. |
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/7589 |
DOI: | 10.6342/NTU201801036 |
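Full-text authorization: | Authorized (open to the public worldwide) |
Electronic full-text release date: | 2023-06-26 |
Appears in collections: | Graduate Institute of Communication Engineering |
Files in this item:
File | Size | Format | |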
---|---|---|---|
ntu-107-1.pdf | 9.1 MB | Adobe PDF | View/Open |
Unless their copyright terms state otherwise, all items in this system are protected by copyright, with all rights reserved.