Title: | Versatile Object Detection Model Based on Single Background Data (單一背景訓練資料之高相容性物體辨識模型) |
Authors: | Yung-Sheng Chang 張詠盛 |
Advisor: | 鄭振牟 |
Co-Advisor: | 廖世偉 |
Keyword: | Deep Learning, Object Detection, Pixel Histogram, Region Proposal, Visual Tracking, Domain Adaptation |
Publication Year: | 2019 |
Degree: | Master's |
Abstract: | Object detection, as the name suggests, involves locating objects in an image, typically by drawing a bounding box around each one. The two most common approaches are: (1) two-stage detectors (e.g., Faster R-CNN), which have achieved the best accuracy in many competitions; and (2) one-stage detectors (e.g., YOLO), which emphasize efficiency and fast image processing. Both require large amounts of training data to reach good accuracy. Large public object detection datasets (e.g., Google's Open Images) contain images of each object across more than a hundred different scenes. However, when training a detector for an object of our own, it is difficult to photograph that object in such a wide variety of environments, so collecting data at that scale becomes a major obstacle.

In this thesis, we propose a data collection process simplified by object tracking, together with a training scheme modified from Faster R-CNN. To train a custom object detector, a user only needs to record a short video of the target object; this completes the data collection, after which training can begin and yields the model parameters for that object's detector. Because all of the training data come from the same environment, the model is at high risk of overfitting and may fail to localize the object once the environment changes. Our training scheme therefore adds a classifier based on regional pixel color histograms, which improves detection accuracy and recall against complex backgrounds. We validate the approach on photos collected from real-world application scenarios; details are given in the chapters that follow. |
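The regional pixel-color-histogram classifier mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's actual implementation: the function names, the histogram-intersection similarity, and the rejection threshold are all assumptions made for illustration. The idea shown is that a proposed region whose color distribution closely matches the known (single) training background can be treated as background.

```python
import numpy as np

def region_histogram(region, bins=16):
    """L1-normalized per-channel color histogram of an image region.

    region: H x W x 3 uint8 array (an RGB crop of a proposed box).
    Returns a flat feature vector of length 3 * bins that sums to 1.
    """
    channels = [np.histogram(region[..., c], bins=bins, range=(0, 256))[0]
                for c in range(3)]
    feat = np.concatenate(channels).astype(np.float64)
    return feat / max(feat.sum(), 1.0)

def histogram_intersection(h1, h2):
    """Similarity of two L1-normalized histograms, in [0, 1]."""
    return float(np.minimum(h1, h2).sum())

def looks_like_background(region, background_hist, threshold=0.8):
    """Hypothetical rejection rule (threshold is an assumed value):
    a proposal whose color distribution closely matches the single
    training background is treated as background, not as the object."""
    sim = histogram_intersection(region_histogram(region), background_hist)
    return sim >= threshold
```

A detector trained on single-background data could apply such a check to region proposals at inference time, keeping only proposals whose color statistics differ from the memorized background.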
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/21290 |
DOI: | 10.6342/NTU201903542 |
Fulltext Rights: | Not authorized |
Appears in Collections: | Department of Electrical Engineering |
Files in This Item:
File | Size | Format
---|---|---
ntu-108-1.pdf (Restricted Access) | 3.26 MB | Adobe PDF
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.