Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/51404
Full metadata record
DC Field / Value / Language
dc.contributor.advisor洪一平(Yi-Ping Hung)
dc.contributor.authorShih-Yao Linen
dc.contributor.author林士堯zh_TW
dc.date.accessioned2021-06-15T13:33:04Z-
dc.date.available2019-03-08
dc.date.copyright2016-03-08
dc.date.issued2016
dc.date.submitted2016-02-01
dc.identifier.citation[1] Yu Kong, Yunde Jia, and Yun Fu. Learning human interaction by interactive phrases. In Proc. Euro. Conf. Computer Vision, pages 300–313, 2012.
[2] Nataliya Shapovalova, Arash Vahdat, Kevin Cannons, Tian Lan, and Greg Mori. Similarity constrained latent support vector machine: An application to
weakly supervised action classification. In Proc. Euro. Conf. Computer Vision, pages 55–68, 2012.
[3] Qinfeng Shi, Li Cheng, Li Wang, and Alex Smola. Human action segmentation and recognition using discriminative semi-markov models. Int. J. Computer
Vision, 93(1):22–32, 2011.
[4] Raviteja Vemulapalli, Felipe Arrate, and Rama Chellappa. Human action recognition by representing 3d skeletons as points in a lie group. In Proc. Conf.
Computer Vision and Pattern Recognition, pages 588–595, 2014.
[5] Xinxiao Wu, Dong Xu, Lixin Duan, Jiebo Luo, and Yunde Jia. Action recognition using multilevel features and latent structural svm. IEEE Trans. on
Circuits and Systems for Video Technology, 23(8):1422–1431, 2013.
[6] Jake K Aggarwal and Michael S Ryoo. Human activity analysis: A review. ACM Computing Surveys, 43(3):16, 2011.
[7] Wanqing Li, Zhengyou Zhang, and Zicheng Liu. Expandable data-driven graphical modeling of human actions based on salient postures. IEEE Trans. on
Circuits and Systems for Video Technology, 18(11):1499–1510, 2008.
[8] Angela Yao, Juergen Gall, Gabriele Fanelli, and Luc J Van Gool. Does human action recognition benefit from pose estimation? volume 3, page 6, 2011.
[9] Yale Song, Louis-Philippe Morency, and Randall Davis. Multi-view latent variable discriminative models for action recognition. In Proc. Conf. Computer
Vision and Pattern Recognition, pages 2120–2127, 2012.
[10] Yale Song, Louis-Philippe Morency, and Randall Davis. Action recognition by hierarchical sequence summarization. In Proc. Conf. Computer Vision and
Pattern Recognition, pages 3562–3569, 2013.
[11] Jaeyong Sung, Colin Ponce, Bart Selman, and Ashutosh Saxena. Unstructured human activity detection from rgbd images. In Proc. Int’ Conf. on Robotics
and Automation, pages 842–849, 2012.
[12] Ariadna Quattoni, Sybor Wang, Louis-Philippe Morency, Michael Collins, and Trevor Darrell. Hidden conditional random fields. IEEE Trans. on Pattern
Analysis and Machine Intelligence, pages 1848–1852, 2007.
[13] Kristen Grauman and Trevor Darrell. The pyramid match kernel: Discriminative classification with sets of image features. In Proc. Int’ Conf. Computer
Vision, volume 2, pages 1458–1465, 2005.
[14] Svetlana Lazebnik, Cordelia Schmid, and Jean Ponce. Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In Proc.
Conf. Computer Vision and Pattern Recognition, volume 2, pages 2169–2178, 2006.
[15] Tian Lan, Tsung-Chuan Chen, and Silvio Savarese. A hierarchical representation for future action prediction. In Proc. Euro. Conf. Computer Vision, pages
689–704, 2014.
[16] Juan Carlos Niebles, Chih-Wei Chen, and Li Fei-Fei. Modeling temporal structure of decomposable motion segments for activity classification. In Proc. Euro.
Conf. Computer Vision, pages 392–405. 2010.
[17] Jiang Wang, Zicheng Liu, Ying Wu, and Junsong Yuan. Mining actionlet ensemble for action recognition with depth cameras. In Proc. Conf. Computer
Vision and Pattern Recognition, pages 1290–1297, 2012.
[18] Jiang Wang, Zicheng Liu, Jan Chorowski, Zhuoyuan Chen, and Ying Wu. Robust 3d action recognition with random occupancy patterns. In Proc. Euro.
Conf. Computer Vision, pages 872–885, 2012.
[19] Mohammad A Gowayyed, Marwan Torki, Mohamed E Hussein, and Motaz El-Saban. Histogram of oriented displacements (hod): describing trajectories
of human joints for action recognition. In Proc. Int’ Joint Conf. on Artificial Intelligence, pages 1351–1357, 2013.
[20] Adriana Kovashka and Kristen Grauman. Learning a hierarchy of discriminative space-time neighborhood features for human action recognition. In Proc. Conf.
Computer Vision and Pattern Recognition, pages 2046–2053, 2010.
[21] Jiang Wang, Zhuoyuan Chen, and Ying Wu. Action recognition with multiscale spatio-temporal contexts. In Proc. Conf. Computer Vision and Pattern
Recognition, pages 3185–3192, 2011.
[22] Ivan Laptev. On space-time interest points. Int. J. Computer Vision, 64(2-3): 107–123, 2005.
[23] Yong Du, Wei Wang, and Liang Wang. Hierarchical recurrent neural network for skeleton based action recognition. In Proc. Conf. Computer Vision and
Pattern Recognition, pages 1110–1118, 2015.
[24] Yu Kong and Yun Fu. Bilinear heterogeneous information machine for rgb-d action recognition. In Proc. Conf. Computer Vision and Pattern Recognition,
pages 1054–1062, 2015.
[25] Fengjun Lv and Ramakant Nevatia. Recognition and segmentation of 3-d human action using hmm and multi-class adaboost. In Proc. Euro. Conf. Computer
Vision, pages 359–372, 2006.
[26] Xiaohan Nie, Caiming Xiong, and Song-Chun Zhu. Joint action recognition and pose estimation from video. In Proc. Conf. Computer Vision and Pattern
Recognition, pages 1293–1301, 2015.
[27] Lei Zhang, Zhi Zeng, and Qiang Ji. Probabilistic image modeling with an extended chain graph for human activity recognition and image segmentation.
IEEE Trans. on Image Processing, 20(9):2401–2413, 2011.
[28] Olusegun Oshin, Andrew Gilbert, and Richard Bowden. Capturing the relative distribution of features for action recognition. In Proc. Conf. Automatic Face
and Gesture Recognition, pages 111–116, 2011.
[29] Alexandros Andre Chaaraoui, José Ramón Padilla-López, and Francisco Flórez-Revuelta. Fusion of skeletal and silhouette-based features for human action
recognition with rgb-d devices. In Proc. Int’ Conf. Computer Vision Workshops, pages 91–97, 2013.
[30] Alper Ayvaci, Michalis Raptis, and Stefano Soatto. Sparse occlusion detection with optical flow. Int. J. Computer Vision, 97(3):322–338, 2012.
[31] Xiaoyu Wang, Tony X Han, and Shuicheng Yan. An hog-lbp human detector with partial occlusion handling. In Proc. Int’ Conf. Computer Vision, pages
32–39, 2009.
[32] Guang Shu, Afshin Dehghan, Omar Oreifej, Emily Hand, and Mubarak Shah. Part-based multiple-person tracking with partial occlusion handling. In Proc.
Conf. Computer Vision and Pattern Recognition, pages 1815–1821, 2012.
[33] Wei Shen, Ke Deng, Xiang Bai, Tommer Leyvand, Baining Guo, and Zhuowen Tu. Exemplar-based human action pose correction and tagging. In Proc. Conf.
Computer Vision and Pattern Recognition, pages 1784–1791, 2012.
[34] Daniel Weinland, Mustafa Özuysal, and Pascal Fua. Making action recognition robust to occlusions and viewpoint changes. In Proc. Euro. Conf. Computer
Vision, pages 635–648, 2010.
[35] Yu Cao, Daniel Barrett, Andrei Barbu, Swaminathan Narayanaswamy, Haonan Yu, Aaron Michaux, Yuewei Lin, Sven Dickinson, Jeffrey Mark Siskind, and
Song Wang. Recognize human activities from partially observed videos. In Proc. Conf. Computer Vision and Pattern Recognition, pages 2658–2665, 2013.
[36] Subhransu Maji, Lubomir Bourdev, and Jitendra Malik. Action recognition from a distributed representation of pose and appearance. In Proc. Conf. Computer
Vision and Pattern Recognition, pages 3177–3184, 2011.
[37] Liang Wang and David Suter. Recognizing human activities from silhouettes: Motion subspace and factorial discriminative graphical model. In Proc. Conf.
Computer Vision and Pattern Recognition, pages 1–8, 2007.
[38] Chia-Chih Chen and JK Aggarwal. Modeling human activities as speech. In Proc. Conf. Computer Vision and Pattern Recognition, pages 3425–3432, 2011.
[39] Jamie Shotton, Ross Girshick, Andrew Fitzgibbon, Toby Sharp, Mat Cook, Mark Finocchio, Richard Moore, Pushmeet Kohli, Antonio Criminisi, Alex Kipman,
et al. Efficient human pose estimation from single depth images. IEEE Trans. on Pattern Analysis and Machine Intelligence, 35(12):2821–2840, 2013.
[40] Kevin Tang, Li Fei-Fei, and Daphne Koller. Learning latent temporal structure for complex event detection. In Proc. Conf. Computer Vision and Pattern
Recognition, pages 1250–1257, 2012.
[41] MS Ryoo. Human activity prediction: Early recognition of ongoing activities from streaming videos. In Proc. Int’ Conf. Computer Vision, pages 1036–1043,
2011.
[42] Kai-Yueh Chang, Tyng-Luh Liu, and Shang-Hong Lai. Learning partially observed hidden conditional random fields for facial expression recognition. In
Proc. Conf. Computer Vision and Pattern Recognition, pages 533–540, 2009.
[43] James W Davis and Ambrish Tyagi. Minimal-latency human action recognition using reliable-inference. Image and Vision Computing, 24(5):455–472, 2006.
[44] Minh Hoai and Fernando De la Torre. Max-margin early event detectors. Int. J. Computer Vision, 107(2):191–202, 2014.
[45] Yun Jiang and Ashutosh Saxena. Modeling high-dimensional humans for activity anticipation using gaussian process latent crfs. In Robotics: Science and
Systems, 2014.
[46] Kang Li and Yun Fu. Prediction of human activity by discovering temporal sequence patterns. IEEE Trans. on Pattern Analysis and Machine Intelligence,
36(8):1644–1657, 2014.
[47] Michalis Raptis and Leonid Sigal. Poselet key-framing: A model for human activity recognition. In Proc. Conf. Computer Vision and Pattern Recognition,
pages 2650–2657, 2013.
[48] Konrad Schindler and Luc Van Gool. Action snippets: How many frames does human action recognition require? In Proc. Conf. Computer Vision and Pattern
Recognition, pages 1–8, 2008.
[49] Gang Yu, Junsong Yuan, and Zicheng Liu. Predicting human activities using spatio-temporal structure of interest points. In Proc. ACM Conf. Multimedia,
pages 1049–1052, 2012.
[50] Gang Yu, Junsong Yuan, and Zicheng Liu. Propagative hough voting for human activity detection and recognition. IEEE Trans. on Circuits and Systems for
Video Technology, 25(1):87–98, 2015.
[51] Jiang Wang, Zicheng Liu, Ying Wu, and Junsong Yuan. Learning actionlet ensemble for 3d human action recognition. IEEE Trans. on Pattern Analysis
and Machine Intelligence, 36(5):914–927, 2014.
[52] Prithviraj Banerjee and Ram Nevatia. Pose filter based hidden-crf models for activity detection. In Proc. Euro. Conf. Computer Vision, pages 711–726, 2014.
[53] Wanqing Li, Zhengyou Zhang, and Zicheng Liu. Action recognition based on a bag of 3d points. In Proc. Int’ Conf. Computer Vision Workshops, pages 9–14,
2010.
[54] C. Sutton and A. McCallum. An Introduction to Conditional Random Fields for Relational Learning. MIT Press, 2007.
[55] Piotr Dollár, Vincent Rabaud, Garrison Cottrell, and Serge Belongie. Behavior recognition via sparse spatio-temporal features. In Proc. Int’ Workshops on
Visual Surveillance and Performance Evaluation of Tracking and Surveillance, pages 65–72, 2005.
[56] Lasitha Piyathilaka and Sarath Kodagoda. Gaussian mixture based hmm for human daily activity recognition using 3d skeleton features. In Proc. Int’ Conf.
on Industrial Electronics and Applications, pages 567–572, 2013.
[57] James Martens and Ilya Sutskever. Learning recurrent neural networks with hessian-free optimization. In Proc. Int’ Conf. Machine Learning, pages 1033–
1040, 2011.
[58] Meinard Müller and Tido Röder. Motion templates for automatic classification and retrieval of motion capture data. In Proc. ACM SIGGRAPH/Eurographics
Symp. on Computer Animation, pages 137–146, 2006.
[59] Lu Xia, Chia-Chih Chen, and JK Aggarwal. View invariant human action recognition using histograms of 3d joints. In Proc. Int’ Conf. Computer Vision
Workshops, 2012.
[60] Xiaodong Yang, Chenyang Zhang, and YingLi Tian. Recognizing actions using depth motion maps-based histograms of oriented gradients. In Proc. ACM
Conf. Multimedia, pages 1057–1060. ACM, 2012.
[61] Matthew Brand, Nuria Oliver, and Alex Pentland. Coupled hidden markov models for complex action recognition. In Proc. Conf. Computer Vision and
Pattern Recognition, pages 994–999, 1997.
[62] Anwaar-ul Haq, Iqbal Gondal, and Manzur Murshed. On temporal order invariance for view-invariant action recognition. IEEE Trans. on Circuits and
Systems for Video Technology, 23(2):203–211, 2013.
[63] Ivan Laptev, Marcin Marszalek, Cordelia Schmid, and Benjamin Rozenfeld. Learning realistic human actions from movies. In Proc. Conf. Computer Vision
and Pattern Recognition, pages 1–8, 2008.
[64] Christian Schüldt, Ivan Laptev, and Barbara Caputo. Recognizing human actions: a local svm approach. In Proc. Int’ Conf. Pattern Recognition, volume 3,
pages 32–36, 2004.
[65] Loren Arthur Schwarz, Diana Mateus, Victor Castañeda, and Nassir Navab. Manifold learning for tof-based human body tracking and activity recognition.
pages 1–11, 2010.
[66] Di Wu and Ling Shao. Silhouette analysis-based action recognition via exploiting human poses. IEEE Trans. on Circuits and Systems for Video Technology,
23(2):236–243, 2013.
[67] Louis-Philippe Morency, Ariadna Quattoni, and Trevor Darrell. Latent-dynamic discriminative models for continuous gesture recognition. In Proc.
Conf. Computer Vision and Pattern Recognition, pages 1–8, 2007.
[68] M. S. Ryoo and J. K. Aggarwal. UT-Interaction Dataset, ICPR contest on Semantic Description of Human Activities (SDHA), 2010.
[69] James MacQueen et al. Some methods for classification and analysis of multivariate observations. In Proc. Berkeley Symp. on Mathematical Statistics and
Probability, volume 1, pages 281–297, 1967.
[70] Sy Bor Wang, Ariadna Quattoni, Louis-Philippe Morency, David Demirdjian, and Trevor Darrell. Hidden conditional random fields for gesture recognition. In
Proc. Conf. Computer Vision and Pattern Recognition, volume 2, pages 1521–1527, 2006.
[71] Omar Oreifej and Zicheng Liu. Hon4d: Histogram of oriented 4d normals for activity recognition from depth sequences. In Proc. Conf. Computer Vision and
Pattern Recognition, pages 716–723, 2013.
[72] Lu Xia and JK Aggarwal. Spatio-temporal depth cuboid similarity feature for activity recognition using depth camera. In Proc. Conf. Computer Vision and
Pattern Recognition, pages 2834–2841, 2013.
[73] Simon Fothergill, Helena M. Mentis, Pushmeet Kohli, and Sebastian Nowozin. Instructing people for training gestural interactive systems. In Proc. Int’ Conf.
Human Factors in Computing Systems, pages 1737–1746, 2012.
[74] Li Fei-Fei and Pietro Perona. A bayesian hierarchical model for learning natural scene categories. In Proc. Conf. Computer Vision and Pattern Recognition,
volume 2, pages 524–531, 2005.
[75] Maxime Devanne, Hazem Wannous, Stefano Berretti, Pietro Pala, Mohamed Daoudi, and Alberto Del Bimbo. 3-d human action recognition by shape analysis
of motion trajectories on riemannian manifold. IEEE Trans. on Cybernetics, 45(7):1340–1352, 2015.
[76] Hongzhao Chen, Guijin Wang, and Li He. Accurate and real-time human action recognition based on 3d skeleton. In Proc. Int’ Conf. Optical Instruments and
Technology, pages 90451Q–90451Q, 2013.
[77] Alexandros Iosifidis, Anastasios Tefas, and Ioannis Pitas. Dynamic action classification
based on iterative data selection and feedforward neural networks. In Proc. Euro. Conf. on Signal Processing, pages 1–5, 2013.
[78] Jian Peng, Liefeng Bo, and Jinbo Xu. Conditional neural fields. In Proc. Advances in Neural Information Processing Systems, pages 1419–1427, 2009.
dc.identifier.urihttp://tdr.lib.ntu.edu.tw/jspui/handle/123456789/51404-
dc.description.abstractVideo-based human action recognition is a key component of video analysis. Despite significant research efforts over the past few decades, human action recognition remains a challenging problem. This dissertation investigates three important and emerging topics in practical action recognition.
First, we consider the crucial problem of recognizing actions with similar movements. An approach called Action Trait Code (ATC) is proposed for human action classification: it represents an action with a set of velocity types derived from the average velocity of each body part. An effective graph model based on ATC is employed for learning and recognizing human actions. To examine recognition accuracy, we evaluate our approach on our self-collected action database.
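The ATC idea described above can be sketched as a simple quantization of per-body-part average speeds. This is a minimal illustration: the input layout, number of velocity types, and equal-width binning are assumptions for the sketch, not the dissertation's actual parameters.

```python
import numpy as np

def action_trait_code(joint_positions, n_types=3):
    """Sketch of an Action Trait Code: quantize each body part's
    average speed into one of a few velocity types.

    joint_positions: array of shape (T, P, 3) -- T frames, P body
    parts, 3-D coordinates (an assumed input layout).
    """
    velocities = np.diff(joint_positions, axis=0)   # frame-to-frame motion, (T-1, P, 3)
    speeds = np.linalg.norm(velocities, axis=2)     # per-part speed, (T-1, P)
    avg_speed = speeds.mean(axis=0)                 # average speed per part, (P,)
    # Quantize average speeds into n_types equal-width bins
    # (e.g. still / slow / fast) to form the code.
    edges = np.linspace(avg_speed.min(), avg_speed.max(), n_types + 1)[1:-1]
    return np.digitize(avg_speed, edges)            # (P,) code in {0, ..., n_types-1}
```

The resulting per-part code vector could then serve as the node labels of the graph model mentioned above.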
Second, an action video may contain many similar observations with occasional and irregular changes, which make commonly used fine-grained features less reliable.
This dissertation introduces a set of temporal pyramid features that enrich the action representation with various levels of semantic granularity. To learn and infer the proposed pyramid features, we adopt a discriminative model with latent variables to capture the hidden dynamics in each layer of the pyramid. Experimental results show that our method achieves more favorable performance than existing methods.
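A minimal sketch of temporal pyramid pooling, assuming mean pooling and a binary split at each level; the actual per-frame features and pooling operator in the dissertation may differ:

```python
import numpy as np

def temporal_pyramid(frame_features, levels=3):
    """Pool per-frame features over a temporal pyramid.

    Level l splits the sequence into 2**l segments and averages each,
    yielding coarse-to-fine summaries (1 + 2 + 4 = 7 vectors for
    3 levels). Mean pooling is an illustrative choice.
    """
    pooled = []
    for level in range(levels):
        # Split the T x D feature sequence into 2**level temporal segments.
        for segment in np.array_split(frame_features, 2 ** level, axis=0):
            pooled.append(segment.mean(axis=0))
    return np.vstack(pooled)  # shape: (2**levels - 1, D)
```

Each pyramid layer's pooled vectors would then feed the latent variables of the discriminative model described above.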
Third, actions in real-world applications often come with corrupt segments of frames caused by various factors. These segments may be of arbitrary length and appear irregularly. They change the appearance of actions dramatically and hence degrade the performance of a pre-trained recognition system. In this dissertation, I present an integrated approach with two key components, outlier filtering and observation completion, which tackles this problem without making any assumptions about the locations of the unobserved segments. Specifically, an outlier frame filtering mechanism is introduced to identify the unobserved frames. Our observation completion algorithm then infers the unobserved parts: it treats the observed parts as a query to the training set and retrieves coherent alternatives to replace the unobserved parts. Hidden conditional random fields (HCRFs) are then used to recognize the filtered and completed actions. Furthermore, we collect a new action dataset in which outlier frames appear irregularly and naturally. Besides this dataset, our approach is evaluated on two benchmark datasets and compared with several state-of-the-art approaches. The superior results demonstrate its effectiveness and general applicability.
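The two components above can be illustrated with a toy sketch: flag frames that are unusually far from all training data, then replace them with their nearest coherent alternatives retrieved from the training set. The z-score criterion and plain nearest-neighbor completion are simplifying assumptions, not the dissertation's exact procedure (which recognizes the result with HCRFs).

```python
import numpy as np

def filter_and_complete(frames, train_frames, z_thresh=2.0):
    """Toy outlier filtering + observation completion.

    frames:       (T, D) observed per-frame feature vectors.
    train_frames: (N, D) per-frame features from the training set.
    """
    # Distance from each observed frame to every training frame.
    d = np.linalg.norm(frames[:, None, :] - train_frames[None, :, :], axis=2)
    nearest = d.argmin(axis=1)       # index of each frame's closest training frame
    nearest_dist = d.min(axis=1)
    # Outlier filtering: frames far from all training data are suspect.
    z = (nearest_dist - nearest_dist.mean()) / (nearest_dist.std() + 1e-8)
    outliers = z > z_thresh
    # Observation completion: substitute retrieved training frames.
    completed = frames.copy()
    completed[outliers] = train_frames[nearest[outliers]]
    return completed, outliers
```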
In addition, I develop a unified action recognition framework that jointly handles outlier detection and action prediction.
For each action to be recognized, we explore the mutual dependency between its frames and augment each frame with extra alternative frames borrowed from the training data. The augmentation mechanism is designed so that a few alternatives are of high quality and can replace the detected corrupt frames. Our approach is built upon hidden conditional random fields. It integrates corrupt frame detection and alternative selection into the prediction process, and can more accurately recognize partially observed actions. The promising results demonstrate its effectiveness and broad applicability.
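The augmentation step alone can be sketched as pairing every frame with its k nearest training alternatives, so a downstream model (HCRFs in the dissertation) can switch to an alternative when the original frame looks corrupt. The value of k and the Euclidean distance are illustrative assumptions.

```python
import numpy as np

def augment_with_alternatives(frames, train_frames, k=3):
    """Pair each frame with its k nearest training-set alternatives.

    frames:       (T, D) observed per-frame features.
    train_frames: (N, D) training per-frame features.
    Returns a (T, k, D) array of candidate replacement frames.
    """
    # Pairwise distances between observed and training frames.
    d = np.linalg.norm(frames[:, None, :] - train_frames[None, :, :], axis=2)
    idx = np.argsort(d, axis=1)[:, :k]   # (T, k) indices of nearest alternatives
    return train_frames[idx]             # (T, k, D) candidate replacements
```

Selecting among these candidates during inference, rather than beforehand, is what distinguishes the unified framework from the two-stage filter-then-complete pipeline.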
en
dc.description.provenanceMade available in DSpace on 2021-06-15T13:33:04Z (GMT). No. of bitstreams: 1
ntu-105-D00944001-1.pdf: 2536272 bytes, checksum: 9f0499b7f73287ee547e3d1adee30ef3 (MD5)
Previous issue date: 2016
en
dc.description.tableofcontentsAcknowledgments ii
Abstract iv
Contents viii
List of Figures xi
List of Tables xv
1 Introduction 1
1.1 Background and Motivation . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Organization of the Dissertation . . . . . . . . . . . . . . . . . . . . . 3
2 Related Work 4
2.1 Overview of Human Action Recognition . . . . . . . . . . . . . . . . 4
2.2 Human Action Representation . . . . . . . . . . . . . . . . . . . . . 5
2.3 Early Prediction and Gap-filling . . . . . . . . . . . . . . . . . . . 6
2.4 Partially Observed Human Action Recognition with HCRFs model . . 6
3 Fundamentals 8
3.1 Introduction to Probabilistic Graphical Models . . . . . . . . . . . . 8
3.2 Action Graph . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.3 Action Recognition with Hidden-State Conditional Random Fields
(HCRFs) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.3.1 Learning with HCRFs . . . . . . . . . . . . . . . . . . . . . . 11
4 Human Action Recognition Using Action Trait Code 14
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
4.2 Human Action Recognition Using Action Trait Code . . . . . . . . . 15
4.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
5 Learning and Inferring Human Actions with Temporal Pyramid Features
based on Conditional Random Fields 22
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
5.2 The Proposed Approach . . . . . . . . . . . . . . . . . . . . . . . . . 25
5.2.1 Action Recognition with HCRFs . . . . . . . . . . . . . . . . 25
5.2.2 Temporal Pyramid Feature Representation . . . . . . . . . . . 27
5.2.3 Learning HCRFs with Temporal Pyramid Representation . . . 27
5.3 Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
5.3.1 Datasets for Evaluation . . . . . . . . . . . . . . . . . . . . . 28
5.3.2 Feature Representation and Evaluation Metrics . . . . . . . . 30
5.3.3 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . 30
5.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
6 Recognizing Partially Observed Human Actions by Observation Filtering and
Completion 33
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
6.2 Our Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
6.2.1 Outlier frame filtering . . . . . . . . . . . . . . . . . . . . . . 37
6.2.2 Observation completion . . . . . . . . . . . . . . . . . . . . . 39
6.3 Experimental Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
6.3.1 Datasets for performance evaluation . . . . . . . . . . . . . . 41
6.3.2 Feature representation . . . . . . . . . . . . . . . . . . . . . . 45
6.3.3 Evaluation metric . . . . . . . . . . . . . . . . . . . . . . . . . 46
6.4 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
6.4.1 Early prediction . . . . . . . . . . . . . . . . . . . . . . . . . 48
6.4.2 Gap-filling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
6.4.3 POAR with synthetic outlier frames . . . . . . . . . . . . . . 52
6.4.4 POAR with real outlier frames . . . . . . . . . . . . . . . . . 55
6.5 Summary and Conclusion . . . . . . . . . . . . . . . . . . . . . . . . 58
7 Learning Conditional Random Fields with Augmented Observations for Partially
Observed Action Recognition 60
7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
7.2 Our Proposed Approach . . . . . . . . . . . . . . . . . . . . . . . . . 63
7.2.1 Alternative augmentation . . . . . . . . . . . . . . . . . . . . 63
7.2.2 Learning HCRFs with augmented actions . . . . . . . . . . . 65
7.3 Experimental Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
7.3.1 Datasets for evaluation . . . . . . . . . . . . . . . . . . . . . . 67
7.3.2 Feature representation and evaluation metrics . . . . . . . . . 70
7.4 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
7.4.1 Results on self-collected dataset . . . . . . . . . . . . . . . . . 71
7.4.2 Results on UT-Interaction database . . . . . . . . . . . . . . . 73
7.5 Early prediction on UT-Interaction dataset . . . . . . . . . . . . . . . 76
7.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
8 Conclusion 80
Publications 82
Bibliography 85
dc.language.isoen
dc.subject條件隨機場zh_TW
dc.subject人體動作辨識zh_TW
dc.subject機率圖形模型zh_TW
dc.subjectProbabilistic Graphical Modelsen
dc.subjectHuman Action Recognitionen
dc.subjectConditional Random Fieldsen
dc.title基於機率圖形模型之人體動作辨識zh_TW
dc.titleHuman Action Recognition based on Probabilistic Graphical Modelsen
dc.typeThesis
dc.date.schoolyear104-1
dc.description.degree博士
dc.contributor.coadvisor陳祝嵩(Chu-Song Chen),林彥宇(Yen-Yu Lin)
dc.contributor.oralexamcommittee王聖智(Sheng-Jyh Wang),陳煥宗(Hwann-Tzong Chen),莊永裕(Yung-Yu Chuang),徐繼聖(Gee-Sern Hsu)
dc.subject.keyword人體動作辨識,機率圖形模型,條件隨機場,zh_TW
dc.subject.keywordHuman Action Recognition,Probabilistic Graphical Models,Conditional Random Fields,en
dc.relation.page93
dc.rights.note有償授權
dc.date.accepted2016-02-02
dc.contributor.author-college電機資訊學院zh_TW
dc.contributor.author-dept資訊網路與多媒體研究所zh_TW
Appears in collections: 資訊網路與多媒體研究所

Files in this item:
File / Size / Format
ntu-105-1.pdf (restricted access) / 2.48 MB / Adobe PDF


Items in this system are protected by copyright, with all rights reserved, unless otherwise indicated.
