  1. NTU Theses and Dissertations Repository
  2. College of Electrical Engineering and Computer Science
  3. Graduate Institute of Communication Engineering
Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/60916
Title: One-Shot Object Detection Using Versatile Attentions (單樣本物件檢測藉由多功能注意力機制)
Authors: Po-Min Hsu (許博閔)
Advisor: Pei-Yuan Wu (吳沛遠)
Keyword: object detection, deep learning, one-shot learning, attention model, metric learning
Publication Year: 2020
Degree: Master's
Abstract: In this thesis, we propose ODVA for one-shot object detection, in which the class to be detected may be absent from the training dataset. ODVA is trained on images from the seen classes. At inference time, it detects objects in the query image that match a given support image, whether the support class is seen or unseen, without any fine-tuning. Spatial and channel attentions encode the distinguishing features of the support image, and the similarity between the query and support images is then estimated. Based on this similarity, a margin-based loss is designed to guide ODVA toward learning an appropriate metric for unseen classes. Experimental evaluations on both the VOC and MS-COCO datasets show the effectiveness of the proposed ODVA compared with other state-of-the-art one-shot and meta-learning works. In addition, to support interpretability, we visualize the RPN proposals and attention vectors, and we demonstrate the effectiveness of each module in ODVA through an ablation study.
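As a rough illustration of the ideas named in the abstract (channel attention, query-support similarity, and a margin-based loss), the sketch below shows one common way such pieces can fit together. It is not the formulation used in the thesis, which this record does not give: the function names, the softmax-based channel weighting, the cosine similarity, and the margin value of 0.5 are all assumptions chosen for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def channel_attention(support, query):
    """Hypothetical channel attention (an assumption, not the thesis's design):
    a softmax over the support vector's channels reweights the query channels."""
    peak = max(support)
    exps = [math.exp(s - peak) for s in support]  # shift by max for stability
    total = sum(exps)
    weights = [e / total for e in exps]
    return [w * q for w, q in zip(weights, query)]


def match_score(support, query):
    """Similarity between the support vector and the attended query vector."""
    return cosine_similarity(support, channel_attention(support, query))


def margin_loss(similarity, is_match, margin=0.5):
    """Assumed margin-based loss: matching pairs are pulled toward similarity 1,
    and non-matching pairs are penalized only when similarity exceeds the margin."""
    if is_match:
        return 1.0 - similarity
    return max(0.0, similarity - margin)


if __name__ == "__main__":
    support = [1.0, 1.0]
    query = [3.0, 3.0]
    score = match_score(support, query)
    print(score, margin_loss(score, is_match=True))
```

With a loss of this shape, a non-matching pair contributes nothing once its similarity falls below the margin, so training effort concentrates on the pairs that are still confusable.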
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/60916
DOI: 10.6342/NTU202001254
Fulltext Rights: Paid license
Appears in Collections: Graduate Institute of Communication Engineering

Files in This Item:
File: U0001-0107202022242500.pdf (Restricted Access)
Size: 15.02 MB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
