NTU Theses and Dissertations Repository
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/65912

Full metadata record

DC Field | Value | Language
dc.contributor.advisor | 鄭士康 (Shyh-Kang Jeng)
dc.contributor.author | Chin-An Lin | en
dc.contributor.author | 林晉安 | zh_TW
dc.date.accessioned | 2021-06-17T00:15:09Z
dc.date.available | 2012-08-01
dc.date.copyright | 2012-08-01
dc.date.issued | 2012
dc.date.submitted | 2012-07-04
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/65912
dc.description.abstract | High-order temporal dependence is very useful for recognizing long-duration actions, but exploiting it greatly increases the computational complexity of commonly used graphical temporal models such as the HMM and the CRF. This thesis proposes multivariate linear prediction to exploit high-order temporal dependence; our temporal model has comparatively low computational complexity. In addition, unlike graphical temporal models, our algorithm needs no state definition or manual state labeling, and it improves recognition rates on bag-of-word representations, which carry a certain amount of noise yet have produced notable results in prior work. Our method also applies broadly: to show this, we experiment not only on video data such as the KTH and UCF datasets but also on skeleton data such as the MSR and Kinect datasets. In most cases we obtain better results than state-of-the-art systems. | zh_TW
dc.description.abstract | To recognize temporally extended actions, it is useful to introduce high-order temporal dependence into the recognition task. However, doing so greatly increases the computational complexity when commonly used graphical models such as the HMM and the CRF are employed. In this thesis, multivariate linear prediction is proposed to exploit high-order temporal dependence at lower computational complexity. In addition, our method requires no effort in defining and manually labeling states, and it can improve bag-of-word representations, which may contain considerable noise but have shown excellent performance in previous work. To show the applicability of the proposed method, we experiment not only on video datasets, including KTH and UCF, but also on skeleton datasets such as MSR 3D Action and UCF Kinect. On most of them, our method achieves performance superior to state-of-the-art methods. | en
dc.description.provenance | Made available in DSpace on 2021-06-17T00:15:09Z (GMT). No. of bitstreams: 1; ntu-101-R98942116-1.pdf: 1890371 bytes, checksum: 701477512c780c3c054dc24dd81d111d (MD5). Previous issue date: 2012. | en
dc.description.tableofcontents |
口試委員會審定書 (Thesis Committee Certification) #
誌謝 (Acknowledgements) i
中文摘要 (Chinese Abstract) ii
ABSTRACT iii
CONTENTS iv
LIST OF FIGURES vi
LIST OF TABLES vii
Chapter 1 Introduction 1
1.1 Our Approach 3
1.2 Research Contributions 3
1.3 Thesis Organization 4
Chapter 2 Related Work 5
2.1 Global Representation 5
2.2 Local Representation 5
2.2.1 Spatio-Temporal Interest Point Detectors 6
2.2.2 Local Descriptors 7
2.2.3 Correlations between Local Descriptors 8
2.3 Parametric Model 9
Chapter 3 Preliminary 11
3.1 Wide-Sense Stationary Process 11
3.2 Linear Prediction 11
3.3 Multiple Kernel Learning 12
Chapter 4 The Proposed Framework 14
4.1 A Generative Formulation 14
4.2 Modeling the Frequency Component 15
4.2.1 Multivariate Linear Prediction 15
4.2.2 Computing Coefficient Matrix 16
4.2.3 Lambda Selection 19
4.2.4 Probability Model 19
4.3 Modeling the Static Component 20
4.4 Fusion 21
Chapter 5 Representations 23
5.1 Bag-of-Word Representation 23
5.1.1 STIP Detection and Local Description 23
5.1.2 Descriptor Quantization and Time-Series Representation 25
5.2 Skeleton Representation 25
Chapter 6 Experiments 27
6.1 Datasets 27
6.2 Kernels for MKL 29
6.3 Channel Combination 30
Chapter 7 Experimental Results 32
7.1 Evaluation of the Proposed Method 32
7.2 Evaluation of High-Order Temporal Dependence 34
Chapter 8 Conclusions 37
REFERENCES 38
dc.language.iso | en
dc.subject | 人類動作辨識 (human action recognition) | zh_TW
dc.subject | 多變數線性預測 (multivariate linear prediction) | zh_TW
dc.subject | 時間模型 (temporal model) | zh_TW
dc.subject | 骨架 (skeleton) | zh_TW
dc.subject | 時間序列 (time series) | zh_TW
dc.subject | 影片描述 (video description) | zh_TW
dc.subject | bag-of-word | en
dc.subject | action recognition | en
dc.subject | multivariate linear prediction | en
dc.subject | skeleton | en
dc.subject | time series | en
dc.subject | video description | en
dc.subject | temporal model | en
dc.title | 使用多變數線性預測之人類動作辨識的時間模型 (A temporal model for human action recognition using multivariate linear prediction) | zh_TW
dc.title | An Efficient Temporal Model for Action Recognition Using Multivariate Linear Prediction | en
dc.type | Thesis
dc.date.schoolyear | 100-2
dc.description.degree | 碩士 (Master's)
dc.contributor.coadvisor | 林彥宇 (Yen-Yu Lin)
dc.contributor.oralexamcommittee | 廖弘源 (Hong-Yuan Mark Liao)
dc.subject.keyword | 人類動作辨識, 多變數線性預測, 時間模型, 骨架, 時間序列, 影片描述 (human action recognition, multivariate linear prediction, temporal model, skeleton, time series, video description) | zh_TW
dc.subject.keyword | action recognition, multivariate linear prediction, temporal model, skeleton, time series, bag-of-word, video description | en
dc.relation.page | 42
dc.rights.note | 有償授權 (paid authorization)
dc.date.accepted | 2012-07-05
dc.contributor.author-college | 電機資訊學院 (College of Electrical Engineering and Computer Science) | zh_TW
dc.contributor.author-dept | 電信工程學研究所 (Graduate Institute of Communication Engineering) | zh_TW
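The abstract above describes the core technique: fit an order-p multivariate linear predictor to a sequence of per-frame descriptors (for example, quantized bag-of-word histograms treated as a time series, per Chapter 5 of the table of contents) and use the fitted coefficient matrices and prediction error as evidence for recognition. The Python sketch below illustrates only that general idea; the function names, the ridge parameter `lam` (standing in for the lambda selection step listed in the table of contents), and the residual-energy score are assumptions made for this illustration, not the thesis's exact formulation.

```python
# Minimal sketch of multivariate linear prediction: an order-p vector
# autoregression fit by ridge-regularized least squares. Illustrative
# only; `lam` and the residual-energy score are assumptions, not the
# thesis's exact formulation.
import numpy as np

def fit_mlp(X, p, lam=1e-3):
    """Fit W so that x_t is approximated by sum_k A_k x_{t-k}, where X
    has shape (T, d) and rows (k-1)*d..k*d-1 of W hold A_k transposed."""
    T, d = X.shape
    # Each row of Z stacks the p previous frames [x_{t-1}, ..., x_{t-p}].
    Z = np.hstack([X[p - k:T - k] for k in range(1, p + 1)])  # (T-p, p*d)
    Y = X[p:]                                                 # (T-p, d)
    # Ridge-regularized normal equations: (Z'Z + lam*I) W = Z'Y.
    return np.linalg.solve(Z.T @ Z + lam * np.eye(p * d), Z.T @ Y)

def residual_energy(X, W, p):
    """Mean squared one-step prediction error; a low value means this
    predictor explains the sequence well."""
    Z = np.hstack([X[p - k:len(X) - k] for k in range(1, p + 1)])
    return float(np.mean((X[p:] - Z @ W) ** 2))

# Toy usage on random data standing in for 100 frames of 8-bin histograms.
rng = np.random.default_rng(0)
seq = rng.random((100, 8))
W = fit_mlp(seq, p=3)
print(residual_energy(seq, W, p=3))
```

In a recognition setting, one such predictor could be fit per action class and a test sequence assigned to the class with the smallest residual; the thesis itself goes further, combining this temporal model with a static component through fusion and multiple kernel learning (Sections 3.3, 4.3, and 4.4 of the table of contents).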
Appears in collections: 電信工程學研究所 (Graduate Institute of Communication Engineering)

Files in this item:
File | Size | Format
ntu-101-1.pdf (not authorized for public access) | 1.85 MB | Adobe PDF
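The provenance field above records an MD5 checksum for the deposited bitstream. As a minimal sketch, assuming the PDF has been downloaded locally under the bitstream name given in the provenance field, the checksum could be verified like this:

```python
# Hedged sketch: verify a downloaded bitstream against the MD5 digest
# recorded in dc.description.provenance. The local filename is an
# assumption taken from the provenance field above.
import hashlib

EXPECTED_MD5 = "701477512c780c3c054dc24dd81d111d"

def md5_of(path, chunk_size=1 << 20):
    # Stream the file in 1 MiB chunks so large PDFs need not fit in memory.
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

print("checksum OK" if md5_of("ntu-101-R98942116-1.pdf") == EXPECTED_MD5
      else "checksum mismatch")
```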
Items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.
