Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/79072
Title: Self-Knowledge Distillation and Re-weighted Average Pooling for Person Re-identification
Original title: 透過知識蒸餾學習暨權重平均池化方法於線上行人重識別系統
Author: Yan-Ting Lin (林彥廷)
Advisor: Li-Chen Fu (傅立成)
Keywords: Deep learning, Information retrieval, Person re-identification
Publication year: 2021
Degree: Master's
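Abstract (Chinese version, translated): In recent years, person re-identification has attracted growing attention, mainly because of the prevalence of deep learning and the broad range of applications it enables, such as smart homes, health care, and surveillance systems. Most of the difficulty in person re-identification comes from environmental factors, such as background clutter, differing viewpoints, lighting, and occlusion of pedestrians. Different people with similar appearance, clothing, and body shape also strongly affect re-identification. To address these challenges, this thesis designs a deep learning model for person re-identification that can be applied in real-world settings, achieving very high accuracy while keeping model complexity in check.
Previous supervised person re-identification models are usually trained under the supervision of hard labels, a training regime in which the relative relationships between classes tend to be ignored. This thesis adopts the concept of knowledge distillation: the inter-class relationships produced by the deep layers of the model serve as soft labels for training the shallow layers. Together with the designed knowledge receiver module, this allows the knowledge learned by the deep part of the network to be transferred successfully to the shallow part. To address a shortcoming of convolutional neural networks, the model also integrates a non-local attention module, which lets the model capture a global view and make better predictions.
To deal with the clutter caused by the background, this thesis proposes a new weighted pooling method that aggregates all high-response features on the feature map and suppresses low-response features in background regions. Weighted pooling remedies the shortcomings of average pooling and max pooling: it effectively removes background clutter that does not help identify a person, and it also takes into account that a person may carry several discriminative features and integrates them.
To demonstrate the effectiveness of this method, we conducted extensive experiments on two challenging public datasets. The experiments show that our method outperforms most current state-of-the-art methods.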

In recent years, person re-identification (Re-ID) has attracted considerable attention in computer vision, owing to the prevalence of deep learning and a wide range of applications including smart homes, elderly care, and surveillance systems. In general, Re-ID is challenged by background clutter, occlusion, different camera viewpoints, and multiple identities with similar appearances. The shape of a human body may look completely different from different viewpoints, so tracking people across distinct cameras remains a challenging problem. These factors hinder the extraction of robust and discriminative representations.
Most fully supervised person Re-ID models are trained with one-hot ground-truth labels, so the informative inter-class relationships produced by the model itself are not fully exploited. It has been shown that the deeper layers of a neural network can produce features that are more discriminative and informative than those of its shallow layers. Hence, we propose a self-distilled framework that distills knowledge of inter-class relationships within the network itself. The knowledge learned in the deeper portion of the network is transmitted to the shallow portions by the proposed Knowledge Receiver (RK). We also integrate a spatial non-local attention (SNLA) mechanism into the network to aggregate semantically similar pixels in the spatial domain. With the aid of SNLA, long-range (global) dependencies in feature maps can be captured.
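The distillation step described above can be sketched in a few lines. This is a minimal illustration, not the thesis's implementation: it assumes the deeper branch's temperature-softened class probabilities act as fixed soft labels for a shallow branch, combined with the usual hard-label cross-entropy. The function names, the `temperature` and `alpha` parameters, and the KL-divergence form are assumptions; the abstract does not specify the loss or the internals of the Knowledge Receiver.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Log of the temperature-scaled softmax, computed stably via log-sum-exp."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    log_norm = peak + math.log(sum(math.exp(s - peak) for s in scaled))
    return [s - log_norm for s in scaled]


def self_distillation_loss(shallow_logits, deep_logits, label,
                           temperature=4.0, alpha=0.5):
    """Hypothetical loss for one sample and one shallow branch.

    hard term: cross-entropy of the shallow branch against the one-hot label.
    soft term: KL(deep || shallow) on temperature-softened distributions,
               with the deep branch treated as a constant teacher.
    """
    hard_loss = -log_softmax(shallow_logits)[label]

    log_teacher = log_softmax(deep_logits, temperature)
    log_student = log_softmax(shallow_logits, temperature)
    soft_loss = sum(math.exp(lt) * (lt - ls)
                    for lt, ls in zip(log_teacher, log_student))
    # Scaling by T^2 is a common convention that keeps the soft term's
    # gradient magnitude comparable across temperatures (assumed here).
    soft_loss *= temperature ** 2

    return (1.0 - alpha) * hard_loss + alpha * soft_loss


# Made-up logits over three identities, for illustration only.
deep_logits = [4.0, 2.5, 0.1]
shallow_logits = [1.0, 0.8, 0.5]
loss = self_distillation_loss(shallow_logits, deep_logits, label=0)
```

In this sketch a higher `temperature` flattens the soft labels, so similarities between non-target identities carry more weight in the loss.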
To tackle the background clutter issue mentioned above, we also propose a new Re-weighted Average Pooling (RAP) that combines the advantages of average pooling and max pooling. RAP enlarges the difference in response values between salient points and unimportant regions and aggregates the salient pixels, spatially merging all the important points in a feature map for the final prediction.
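The abstract does not give the RAP formula, so the sketch below uses a stand-in with the same stated goal: a softmax-weighted spatial average over one feature-map channel. High responses receive larger weights and low (background) responses are suppressed. The function name and the `sharpness` parameter are hypothetical; with `sharpness=0` the result equals average pooling, and as `sharpness` grows it approaches max pooling.

```python
import math


def reweighted_average_pool(feature_map, sharpness=1.0):
    """Softmax-weighted average over all spatial positions of one channel.

    feature_map: 2D list of response values (rows x columns).
    sharpness:   assumed knob; 0 gives plain averaging, large values
                 concentrate the weight on the strongest responses.
    """
    values = [v for row in feature_map for v in row]
    peak = max(values)
    # Subtracting the peak keeps every exponent <= 0 for numerical stability.
    weights = [math.exp(sharpness * (v - peak)) for v in values]
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / total


# Made-up 3x3 channel: two salient responses on a low-response background.
channel = [
    [0.1, 0.2, 0.1],
    [0.1, 3.0, 2.8],
    [0.2, 0.1, 0.1],
]
pooled = reweighted_average_pool(channel, sharpness=2.0)
```

Unlike max pooling, both salient responses (3.0 and 2.8) contribute to the pooled value here, while the background positions contribute very little.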
Experiments show that the method proposed in this thesis outperforms state-of-the-art methods.
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/79072
DOI: 10.6342/NTU202100686
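Full-text license: Paid license
Electronic full-text release date: 2024-12-31
Appears in department: Department of Electrical Engineering (電機工程學系)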

Files in this item:
File                          Size      Format
U0001-0802202119562700.pdf    4.75 MB   Adobe PDF
  (not authorized for public access)
Unless their copyright terms are specifically indicated, all items in this system are protected by copyright, with all rights reserved.
