Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/49327

Full metadata record
| DC 欄位 | 值 | 語言 |
|---|---|---|
| dc.contributor.advisor | 傅立成 | |
| dc.contributor.author | Tang-Wei Hsu | en |
| dc.contributor.author | 許唐瑋 | zh_TW |
| dc.date.accessioned | 2021-06-15T11:23:50Z | - |
| dc.date.available | 2019-10-26 | |
| dc.date.copyright | 2016-10-26 | |
| dc.date.issued | 2016 | |
| dc.date.submitted | 2016-08-17 | |
| dc.identifier.citation | [1] S. C. Lin, A. S. Liu, T. W. Hsu, and L. C. Fu, "Representative Body Points on Top-View Depth Sequences for Daily Activity Recognition," in 2015 IEEE International Conference on Systems, Man, and Cybernetics (SMC), 2015, pp. 2968-2973.
[2] T. E. Tseng, A. S. Liu, P. H. Hsiao, C. M. Huang, and L. C. Fu, "Real-time people detection and tracking for indoor surveillance using multiple top-view depth cameras," in 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2014, pp. 4077-4082.
[3] H. Wang and C. Schmid, "Action Recognition with Improved Trajectories," in 2013 IEEE International Conference on Computer Vision, 2013, pp. 3551-3558.
[4] C. Wang, Y. Wang, and A. L. Yuille, "An Approach to Pose-Based Action Recognition," in 2013 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013, pp. 915-922.
[5] H. Wang, A. Kläser, C. Schmid, and C. L. Liu, "Action recognition by dense trajectories," in 2011 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2011, pp. 3169-3176.
[6] S. Cheema, A. Eweiwi, C. Thurau, and C. Bauckhage, "Action recognition by learning discriminative key poses," in 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops), 2011, pp. 1302-1309.
[7] A. Yao, J. Gall, and L. V. Gool, "A Hough transform-based voting framework for action recognition," in 2010 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010, pp. 2061-2068.
[8] F. Lv and R. Nevatia, "Single View Human Action Recognition using Key Pose Matching and Viterbi Path Searching," in 2007 IEEE Conference on Computer Vision and Pattern Recognition, 2007, pp. 1-8.
[9] C. Chen, R. Jafari, and N. Kehtarnavaz, "Action Recognition from Depth Sequences Using Depth Motion Maps-Based Local Binary Patterns," in 2015 IEEE Winter Conference on Applications of Computer Vision, 2015, pp. 1092-1099.
[10] H. Rahmani, A. Mahmood, D. Q. Huynh, and A. Mian, "HOPC: Histogram of Oriented Principal Components of 3D Pointclouds for Action Recognition," in Computer Vision – ECCV 2014, Part II. Cham: Springer International Publishing, 2014, pp. 742-757.
[11] C. Lu, J. Jia, and C. K. Tang, "Range-Sample Depth Feature for Action Recognition," in 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014, pp. 772-779.
[12] J. K. Aggarwal and L. Xia, "Human activity recognition from 3D data: A review," Pattern Recognition Letters, vol. 48, pp. 70-80, 2014.
[13] L. Xia and J. K. Aggarwal, "Spatio-temporal Depth Cuboid Similarity Feature for Activity Recognition Using Depth Camera," in 2013 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013, pp. 2834-2841.
[14] O. Oreifej and Z. Liu, "HON4D: Histogram of Oriented 4D Normals for Activity Recognition from Depth Sequences," in 2013 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013, pp. 716-723.
[15] L. Xia, C. C. Chen, and J. K. Aggarwal, "View invariant human action recognition using histograms of 3D joints," in 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, 2012, pp. 20-27.
[16] J. Wang, Z. Liu, Y. Wu, and J. Yuan, "Mining actionlet ensemble for action recognition with depth cameras," in 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2012, pp. 1290-1297.
[17] J. Wang, Z. Liu, J. Chorowski, Z. Chen, and Y. Wu, "Robust 3D Action Recognition with Random Occupancy Patterns," in Computer Vision – ECCV 2012, Part II. Berlin, Heidelberg: Springer, 2012, pp. 872-885.
[18] A. W. Vieira, E. R. Nascimento, G. L. Oliveira, Z. Liu, and M. F. M. Campos, "STOP: Space-Time Occupancy Patterns for 3D Action Recognition from Depth Map Sequences," in Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications: CIARP 2012. Berlin, Heidelberg: Springer, 2012, pp. 252-259.
[19] L. Wang, Y. Qiao, and X. Tang, "Video Action Detection with Relational Dynamic-Poselets," in Computer Vision – ECCV 2014, Part V. Cham: Springer International Publishing, 2014, pp. 565-580.
[20] P. Banerjee and R. Nevatia, "Pose Filter Based Hidden-CRF Models for Activity Detection," in Computer Vision – ECCV 2014, Part II. Cham: Springer International Publishing, 2014, pp. 711-726.
[21] Y. Tian, R. Sukthankar, and M. Shah, "Spatiotemporal Deformable Part Models for Action Detection," in 2013 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013, pp. 2642-2649.
[22] J. Yuan, Z. Liu, and Y. Wu, "Discriminative Video Pattern Search for Efficient Action Detection," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, pp. 1728-1743, 2011.
[23] Q. Luo, X. Kong, G. Zeng, and J. Fan, "Human action detection via boosted local motion histograms," Machine Vision and Applications, vol. 21, pp. 377-389, 2010.
[24] L. Cao, Z. Liu, and T. S. Huang, "Cross-dataset action detection," in 2010 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010, pp. 1998-2005.
[25] C. Chen, K. Liu, and N. Kehtarnavaz, "Real-time human action recognition based on depth motion maps," Journal of Real-Time Image Processing, vol. 12, pp. 155-163, 2016.
[26] G. Gkioxari and J. Malik, "Finding action tubes," in 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015, pp. 759-768.
[27] M. Rauter, "Reliable Human Detection and Tracking in Top-View Depth Images," in 2013 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2013, pp. 529-534.
[28] G. Hu, D. Reilly, B. Swinden, and Q. Gao, "Human Activity Analysis in a 3D Bird's-eye View," in Image Analysis and Recognition: ICIAR 2014, Part II. Cham: Springer International Publishing, 2014, pp. 365-373.
[29] G. Hu, D. Reilly, M. Alnusayri, B. Swinden, and Q. Gao, "DT-DT: Top-down Human Activity Analysis for Interactive Surface Applications," in Proceedings of the Ninth ACM International Conference on Interactive Tabletops and Surfaces, 2014.
[30] O. Barnich and M. V. Droogenbroeck, "ViBe: A Universal Background Subtraction Algorithm for Video Sequences," IEEE Transactions on Image Processing, vol. 20, pp. 1709-1724, 2011.
[31] O. Barnich and M. V. Droogenbroeck, "ViBE: A powerful random technique to estimate the background in video sequences," in 2009 IEEE International Conference on Acoustics, Speech and Signal Processing, 2009, pp. 945-948.
[32] B. E. Boser, I. M. Guyon, and V. N. Vapnik, "A training algorithm for optimal margin classifiers," in Proceedings of the Fifth Annual Workshop on Computational Learning Theory, 1992.
[33] C. Cortes and V. Vapnik, "Support-Vector Networks," Machine Learning, vol. 20, pp. 273-297, 1995.
[34] C.-N. J. Yu and T. Joachims, "Learning structural SVMs with latent variables," in Proceedings of the 26th Annual International Conference on Machine Learning, 2009.
[35] P. F. Felzenszwalb, R. B. Girshick, D. McAllester, and D. Ramanan, "Object Detection with Discriminatively Trained Part-Based Models," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 32, pp. 1627-1645, 2010.
[36] N. Dalal and B. Triggs, "Histograms of oriented gradients for human detection," in 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2005, pp. 886-893, vol. 1.
[37] L. Chen, H. Lin, and S. Li, "Depth image enhancement for Kinect using region growing and bilateral filter," in 2012 21st International Conference on Pattern Recognition (ICPR), 2012, pp. 3070-3073.
[38] Y. L. Huang, T. W. Hsu, and S. Y. Chien, "Edge-aware depth completion for pointcloud 3D scene visualization on an RGB-D camera," in 2014 IEEE Conference on Visual Communications and Image Processing, 2014, pp. 422-425.
[39] J. Liu, X. Gong, and J. Liu, "Guided inpainting and filtering for Kinect depth maps," in 2012 21st International Conference on Pattern Recognition (ICPR), 2012, pp. 2055-2058.
[40] T. Q. Pham, "Non-maximum Suppression Using Fewer than Two Comparisons per Pixel," in Advanced Concepts for Intelligent Vision Systems: ACIVS 2010, Part I. Berlin, Heidelberg: Springer, 2010, pp. 438-451.
[41] D. Comaniciu and P. Meer, "Mean shift: a robust approach toward feature space analysis," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, pp. 603-619, 2002.
[42] C.-C. Chang and C.-J. Lin, "LIBSVM: A library for support vector machines," ACM Transactions on Intelligent Systems and Technology, vol. 2, pp. 1-27, 2011.
[43] N. Nguyen and Y. Guo, "Comparisons of sequence labeling algorithms and extensions," in Proceedings of the 24th International Conference on Machine Learning, 2007.
[44] A. L. Yuille and A. Rangarajan, "The concave-convex procedure," Neural Computation, vol. 15, pp. 915-936, 2003.
[45] K. Zhang, L. Zhang, and M.-H. Yang, "Real-time compressive tracking," in Proceedings of the 12th European Conference on Computer Vision, Part III, 2012.
[46] T. W. Hsu, Y. H. Yang, T. H. Yeh, A. S. Liu, L. C. Fu, and Y. C. Zeng, "Privacy Free Indoor Action Detection System Using Top-View Depth Camera based on Key-poses," in 2016 IEEE International Conference on Systems, Man, and Cybernetics (SMC), 2016, pp. 2968-2973. | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/49327 | - |
| dc.description.abstract | 在本論文中,我們提出一個新穎的室內動作偵測系統,用於自動記錄使用者日常生活的行為。我們使用俯視深度攝影機來實現動作偵測系統,這讓本系統比較沒有隱私問題且較不會對使用者的日常生活造成干擾。另外相較於傳統側向視角或是監視視角的彩色攝影機而言,我們的系統較不會受到遮蔽或是光線變化的影響。
動作偵測的目的在於指出一段影像串流當中的哪裡和哪一個時間點有我們在意的動作發生。在本論文中,我們將一個動作拆解成一連串依照特定順序排列的重要姿勢。我們使用隱藏變量支持向量機建立某個動作的模型,在模型當中我們學習了重要姿勢的外觀和重要姿勢在時間中應該如何排列。我們使用兩種互補的特徵來描述姿勢。第一個是深度差距直方圖,我們使用這一個特徵描述姿勢的外觀;第二個是位置標誌特徵,這一個特徵描述了人與地板和其他非地板的物體,譬如:床或椅子間的空間關係。 我們使用召回率─精準率曲線和平均精準度來衡量我們的動作偵測系統。實驗結果顯示我們系統的精準度和強韌性。 | zh_TW |
| dc.description.abstract | In this thesis, we propose a novel indoor daily activity detection system that can automatically keep a log of the user's daily life. The hardware setting adopts top-view depth cameras, which makes our system less privacy-sensitive and less intrusive to users. Moreover, in contrast with the traditional setting of a side-view or surveillance-view RGB camera, our camera setting avoids the problems of occlusion and illumination change.
The goal of action detection is to identify where and when the actions of interest happen in a video stream. In this work, we regard an action as a sequence of key-poses of the user of interest arranged in a particular temporal order. To model an action, we use the latent SVM framework to jointly learn the appearance of the key-poses and their temporal locations. We describe human postures with two complementary kinds of features. The first is the histogram of depth difference values, which encodes the shape of a human pose. The second is the location-signified feature, which captures the spatial relations among the person of interest, the floor, and other non-floor objects, e.g., a bed or chairs. We use the recall-precision curve and average precision (AP) to validate the proposed daily activity detection system, and the experimental results demonstrate its accuracy and robustness. | en |
| dc.description.provenance | Made available in DSpace on 2021-06-15T11:23:50Z (GMT). No. of bitstreams: 1 ntu-105-R03921002-1.pdf: 3745240 bytes, checksum: f82f6d10f4f66235b08c6b150591d3dc (MD5) Previous issue date: 2016 | en |
| dc.description.tableofcontents | Acknowledgements (誌謝) I
Chinese Abstract (摘要) II
ABSTRACT III
TABLE OF CONTENTS IV
LIST OF FIGURES VII
LIST OF TABLES X
Chapter 1 Introduction 1
1.1 Motivation 1
1.2 Literature Review 2
1.2.1 Action Understanding 2
1.2.2 Action Recognition based on Depth Images 3
1.2.3 Action Detection 4
1.2.4 Applications based on Top-View Depth Camera 4
1.3 Contribution 6
1.4 Thesis Organization 7
Chapter 2 Preliminaries 8
2.1 ViBe Background Subtraction Algorithm 8
2.1.1 Introduction to ViBe Background Subtraction Algorithm 8
2.1.2 Pixel Model and Background Classification 9
2.1.3 Model Initialization 10
2.1.4 Model Updating 10
2.2 Support Vector Machine 12
2.2.1 Introduction to Support Vector Machine 12
2.2.2 Soft Margin SVM 14
2.2.3 Kernel Trick of SVM 15
2.3 Latent Support Vector Machine 15
2.3.1 Introduction to Latent SVM 16
2.3.2 Deformable Part Model 16
Chapter 3 Camera Setting and Image Preprocessing 20
3.1 Camera Setting 20
3.2 Preprocessing 21
3.2.1 Depth Image Inpainting 21
3.2.2 Background Subtraction 22
3.2.3 World Coordinate Mapping 26
Chapter 4 Human Detection and Tracking 30
4.1 Human Detection 30
4.1.1 Head Candidates Finding 31
4.1.2 Histogram of Depth Difference Feature and Head-Shoulder Classifier 32
4.2 Human Tracking 34
Chapter 5 Latent SVM for Action Model 37
5.1 The Action and Key-poses 37
5.2 Action Model Description 39
5.3 Feature Extraction 40
5.4 Model Training 42
5.5 Online Action Detection 44
5.6 Action Suppression Mechanism 44
Chapter 6 Experiment 46
6.1 Description of Experimental Environment and Dataset 46
6.2 Human Tracking Results 49
6.3 Action Recognition Results 51
6.4 Action Detection Results 54
6.4.1 Action Suppression 54
6.4.2 Action Detection Result on Single Person Dataset 55
6.4.3 Action Detection Result on Multiple People Dataset 59
Chapter 7 Conclusion and Future Work 61
REFERENCE 63 | |
| dc.language.iso | en | |
| dc.subject | 隱藏變量支持向量機 | zh_TW |
| dc.subject | 動作偵測 | zh_TW |
| dc.subject | 重要姿勢 | zh_TW |
| dc.subject | latent SVM | en |
| dc.subject | key-pose | en |
| dc.subject | activity detection | en |
| dc.title | 使用俯視深度攝影機之智慧家庭日常動作偵測系統 | zh_TW |
| dc.title | Daily Activity Detection System Using Top-View Depth Camera for Smart Home Environment | en |
| dc.type | Thesis | |
| dc.date.schoolyear | 104-2 | |
| dc.description.degree | 碩士 | |
| dc.contributor.oralexamcommittee | 陳祝嵩,范欽雄,莊永裕,黃正民 | |
| dc.subject.keyword | 動作偵測,重要姿勢,隱藏變量支持向量機, | zh_TW |
| dc.subject.keyword | activity detection,key-pose,latent SVM, | en |
| dc.relation.page | 66 | |
| dc.identifier.doi | 10.6342/NTU201602859 | |
| dc.rights.note | 有償授權 | |
| dc.date.accepted | 2016-08-18 | |
| dc.contributor.author-college | 電機資訊學院 | zh_TW |
| dc.contributor.author-dept | 電機工程學研究所 | zh_TW |
| Appears in Collections: | Department of Electrical Engineering | |
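The abstract above describes a "histogram of depth difference" feature that encodes the shape of a human pose from top-view depth pixels. The thesis's exact reference point, bin count, and value range are not given in this record, so the following is only a minimal illustrative sketch (the function name, 16-bin default, and millimeter clipping range are assumptions):

```python
import numpy as np

def histogram_of_depth_difference(depth_patch, reference_depth, bins=16, max_diff=400.0):
    """Illustrative sketch: quantize per-pixel depth differences (e.g., in mm,
    relative to a reference such as the detected head depth) into a normalized
    histogram that can serve as a pose-shape descriptor."""
    diff = depth_patch.astype(np.float64) - reference_depth
    # Clip outliers so all mass stays inside the histogram range.
    diff = np.clip(diff, -max_diff, max_diff)
    hist, _ = np.histogram(diff, bins=bins, range=(-max_diff, max_diff))
    total = hist.sum()
    # L1-normalize so patches of different sizes are comparable.
    return hist / total if total > 0 else hist.astype(np.float64)

# Example: a 2x2 depth patch described relative to a reference depth of 250 mm.
patch = np.array([[100.0, 200.0], [300.0, 400.0]])
feature = histogram_of_depth_difference(patch, 250.0, bins=4)
```

Such a histogram is translation-invariant in depth (only differences matter), which is one plausible reason a depth-difference encoding suits a ceiling-mounted camera where absolute depth varies with floor distance.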
Files in This Item:
| File | Size | Format |
|---|---|---|
| ntu-105-1.pdf (Restricted Access) | 3.66 MB | Adobe PDF |
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.
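The abstract evaluates detection with recall-precision curves and average precision (AP). The thesis's exact matching protocol is not given in this record; as a hedged illustration, one common way to compute a recall-precision curve and AP from a ranked list of detections can be sketched as:

```python
import numpy as np

def precision_recall_ap(scores, labels):
    """Illustrative sketch: sort detections by confidence, then compute
    precision and recall at every rank. AP is approximated here as the mean
    precision at the ranks where a true positive is retrieved."""
    order = np.argsort(scores)[::-1]          # highest-confidence first
    y = np.asarray(labels, dtype=float)[order]
    tp = np.cumsum(y)                          # true positives seen so far
    precision = tp / np.arange(1, len(y) + 1)  # precision at each rank
    recall = tp / max(y.sum(), 1.0)            # recall at each rank
    ap = float(precision[y.astype(bool)].mean()) if y.any() else 0.0
    return precision, recall, ap

# Example: four detections, two of which match ground-truth actions.
p, r, ap = precision_recall_ap([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
```

Plotting `recall` against `precision` gives the recall-precision curve the abstract refers to; different definitions of AP (e.g., interpolated variants) would change the numbers slightly.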
