  1. NTU Theses and Dissertations Repository
  2. College of Electrical Engineering and Computer Science
  3. Graduate Institute of Communication Engineering
Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/65912
Title: An Efficient Temporal Model for Action Recognition Using Multivariate Linear Prediction (original title: 使用多變數線性預測之人類動作辨識的時間模型)
Authors: Chin-An Lin (林晉安)
Advisor: Shyh-Kang Jeng (鄭士康)
Co-Advisor: Yen-Yu Lin (林彥宇)
Keywords: action recognition, multivariate linear prediction, temporal model, skeleton, time series, bag-of-words, video description
Publication Year: 2012
Degree: Master
Abstract: To recognize temporally extended actions, it is useful to introduce high-order temporal dependence into the recognition task. However, doing so greatly increases computational complexity when commonly used graphical temporal models such as HMMs or CRFs are employed. This thesis proposes multivariate linear prediction to exploit high-order temporal dependence at lower computational cost. In addition, unlike graphical temporal models, our method requires no definition or manual labeling of states, and it improves recognition accuracy on bag-of-words representations, which may contain considerable noise but have shown excellent performance in previous work. To demonstrate the broad applicability of the proposed method, we experiment not only on video datasets including KTH and UCF, but also on skeleton datasets such as MSR 3D action and UCF Kinect. In most cases, our method outperforms state-of-the-art methods.
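The abstract describes the core idea, multivariate linear prediction over sequences of frame-level feature vectors, without implementation detail. Below is a minimal sketch of that idea as a vector autoregression fitted by least squares; the model order, feature dimensionality, and the use of the one-step prediction residual as a matching score are illustrative assumptions, not the thesis's published method.

```python
# Hedged sketch: multivariate linear prediction (vector autoregression) over
# per-frame feature vectors. Model order and the residual-based score are
# illustrative assumptions; the thesis does not publish code.
import numpy as np

def fit_mlp_coeffs(X, order=3):
    """Fit stacked coefficients A minimizing ||x_t - sum_k A_k x_{t-k}||^2.

    X: (T, d) array of frame descriptors. Returns A of shape (order*d, d).
    """
    T, d = X.shape
    # Lagged design matrix: row for time t holds [x_{t-1}, ..., x_{t-order}].
    Z = np.hstack([X[order - k: T - k] for k in range(1, order + 1)])  # (T-order, order*d)
    Y = X[order:]                                                      # (T-order, d)
    A, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    return A

def prediction_residual(X, A, order=3):
    """Mean squared one-step prediction error of the fitted linear model."""
    T, d = X.shape
    Z = np.hstack([X[order - k: T - k] for k in range(1, order + 1)])
    return float(np.mean((X[order:] - Z @ A) ** 2))
```

One plausible use, consistent with the abstract's framing, is to fit one set of prediction coefficients per action class and classify a test sequence by whichever class's model yields the lowest prediction residual; sequences generated by a (near-)linear temporal process fit almost exactly, while unstructured noise leaves a large residual.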
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/65912
Fulltext Rights: Authorized for a fee (有償授權)
Appears in Collections: Graduate Institute of Communication Engineering

Files in This Item:
File: ntu-101-1.pdf (Restricted Access)
Size: 1.85 MB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

Contact Information
No. 1, Sec. 4, Roosevelt Rd., Da'an Dist., Taipei 10617, Taiwan (R.O.C.)
Tel: (02)33662353
Email: ntuetds@ntu.edu.tw
© NTU Library All Rights Reserved