Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/8254
Title: | Towards Robust Action Recognition via Adversarial Object Synthesis Training (以對抗式物件合成訓練輔助動作辨識模型以減輕其偏差) |
Author: | Zhe-Yu Liu (劉哲宇) |
Advisor: | Winston H. Hsu (徐宏民, whsu@ntu.edu.tw) |
Keywords: | Action Recognition, Dataset Bias, Adversarial Training |
Year of publication: | 2020 |
Degree: | Master's |
Abstract: | Action recognition requires a model to understand the motion patterns of humans or other objects. However, we find that even many of the best-performing models rely to some degree on static objects in the surrounding scene to predict the action, rather than on the action itself. This dependency hurts robustness when a model is applied to a new environment with a different object distribution, because many action classes (e.g. 'take') are not tied to any fixed object. In this paper, we call this problem Fallacious Object Reliance (FOR) and discuss in detail the role that object representation bias plays in several action recognition datasets. Based on these observations, we propose several metrics to measure the severity of FOR, along with a new training procedure, Adversarial Object Synthesis Training (AdvOST), to mitigate it. AdvOST trains a neural synthesizer that pastes object images onto training videos to confuse the action recognition model, while a discriminator network regularizes the synthesizer so that its compositions remain natural-looking. This pushes the action model to ignore unrelated static object cues and thereby reduces FOR. In our experiments, applying AdvOST improves the accuracy of the state-of-the-art I3D and SlowFast models on the EPIC-KITCHENS validation sets, and consistently improves the accuracy of I3D on the three splits of HMDB51. |
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/8254 |
DOI: | 10.6342/NTU202003008 |
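Full-text authorization: | Authorized (open access worldwide) |
Appears in collections: | Department of Computer Science and Information Engineering |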
Files in this item:
File | Size | Format | |
---|---|---|---|
U0001-1108202020115200.pdf | 5.63 MB | Adobe PDF | View/Open |
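Unless their copyright terms are specifically stated otherwise, all items in this system are protected by copyright, with all rights reserved.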