Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/21489
Title: | Memory-Efficient Hardware-Friendly Algorithm of Region Proposal Network for Human-Object Interaction (人物互動之區域檢測網路硬體導向演算法) |
Author: | Chia-Ho Lin (林家禾) |
Advisor: | Liang-Gee Chen (陳良基) |
Keywords: | real-time egocentric action recognition, region proposal network, hardware-friendly algorithm, architecture design |
Publication year: | 2019 |
Degree: | Master's |
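Abstract: | The development of wearable devices in recent years has given rise to a series of computer vision applications, most of which involve effectively recognizing human actions and the objects being manipulated. Meanwhile, neural network accelerators have advanced the development of Human-Object Interaction systems. However, architecture designs for object detection modules that target human-object interaction remain relatively scarce, and practical applications also call for a module that delivers low-power, real-time output on high-resolution video input.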
This thesis presents a hardware-friendly algorithm design for a Region Proposal Network adapted to human-object interaction scenarios. Its motivation is to optimize the Region Proposal Network algorithm to meet the needs of future wearable devices. First, to address the network's high memory usage, we exploit the sparsity of the feature maps to design a zero-skipping mechanism and a sparse convolution architecture, greatly reducing memory requirements. In addition, to address the network's high bandwidth usage, we design a pooling layer that operates on patches and reduce the number of initial anchor boxes, lowering bandwidth requirements. This is the first thesis to propose a hardware-friendly algorithm design for a Region Proposal Network in human-object interaction scenarios. The resulting chip features low memory requirements, low bandwidth usage, and high throughput, and can in the future be combined with a deep neural network engine to form a state-of-the-art real-time action recognition system.
The growth of wearable devices in recent years has facilitated a series of advanced applications in computer vision, which usually involve recognizing human activity and manipulated objects. CNN-based accelerators proposed in recent years have facilitated progress in human-object interaction systems. However, the lack of a robust and flexible architecture for object detection makes it difficult to build a reliable system for egocentric applications. In addition, the detector must respond in real time to high-resolution video at low power consumption to suit egocentric-view applications. This thesis introduces a hardware-friendly algorithm and a hardware architecture for the Region Proposal Network as a video representation. The Region Proposal Network is a lightweight deep neural network for detection that is easy to couple with network models of different sizes to balance accuracy against complexity. It has also been shown to benefit egocentric action recognition in recent works. Our goal is to achieve real-time computing for applications related to real-world egocentric action recognition. First, we propose a hardware-friendly algorithm, including anchor reduction and patch-based RoI pooling, to reduce resource requirements. Based on these techniques, we further design an architecture, including sparsity-aware convolution and channel-wise zero-skipping, to achieve high throughput, low memory cost, and low bandwidth. We are the first to propose a hardware-friendly algorithm and architecture for the Region Proposal Network.
With its small area, low memory cost, and high throughput, the chip can serve many applications on mobile devices, or be combined with a deep neural network to implement state-of-the-art systems for real-time action recognition. |
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/21489 |
DOI: | 10.6342/NTU201902212 |
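Full-text authorization: | Not authorized |
Appears in collections: | Graduate Institute of Electronics Engineering |
Files in this item:
File | Size | Format | |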
---|---|---|---|
ntu-108-1.pdf (not currently authorized for public access) | 10.25 MB | Adobe PDF |
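Unless their copyright terms are specifically stated otherwise, all items in the system are protected by copyright, with all rights reserved.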