
DSpace

The DSpace institutional repository system is dedicated to preserving digital materials of all kinds (e.g., text, images, PDF) and making them easy to access.

Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/49327
Full metadata record
DC Field | Value | Language
dc.contributor.advisor: 傅立成
dc.contributor.author: Tang-Wei Hsu (en)
dc.contributor.author: 許唐瑋 (zh_TW)
dc.date.accessioned: 2021-06-15T11:23:50Z
dc.date.available: 2019-10-26
dc.date.copyright: 2016-10-26
dc.date.issued: 2016
dc.date.submitted: 2016-08-17
dc.identifier.citation: [1] S. C. Lin, A. S. Liu, T. W. Hsu, and L. C. Fu, 'Representative Body Points on Top-View Depth Sequences for Daily Activity Recognition,' in 2015 IEEE International Conference on Systems, Man, and Cybernetics (SMC), 2015, pp. 2968-2973.
[2] T. E. Tseng, A. S. Liu, P. H. Hsiao, C. M. Huang, and L. C. Fu, 'Real-time people detection and tracking for indoor surveillance using multiple top-view depth cameras,' in 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2014, pp. 4077-4082.
[3] H. Wang and C. Schmid, 'Action Recognition with Improved Trajectories,' in 2013 IEEE International Conference on Computer Vision, 2013, pp. 3551-3558.
[4] C. Wang, Y. Wang, and A. L. Yuille, 'An Approach to Pose-Based Action Recognition,' in 2013 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013, pp. 915-922.
[5] H. Wang, A. Kläser, C. Schmid, and C. L. Liu, 'Action recognition by dense trajectories,' in 2011 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2011, pp. 3169-3176.
[6] S. Cheema, A. Eweiwi, C. Thurau, and C. Bauckhage, 'Action recognition by learning discriminative key poses,' in 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops), 2011, pp. 1302-1309.
[7] A. Yao, J. Gall, and L. V. Gool, 'A Hough transform-based voting framework for action recognition,' in 2010 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010, pp. 2061-2068.
[8] F. Lv and R. Nevatia, 'Single View Human Action Recognition using Key Pose Matching and Viterbi Path Searching,' in 2007 IEEE Conference on Computer Vision and Pattern Recognition, 2007, pp. 1-8.
[9] C. Chen, R. Jafari, and N. Kehtarnavaz, 'Action Recognition from Depth Sequences Using Depth Motion Maps-Based Local Binary Patterns,' in 2015 IEEE Winter Conference on Applications of Computer Vision, 2015, pp. 1092-1099.
[10] H. Rahmani, A. Mahmood, D. Q. Huynh, and A. Mian, 'HOPC: Histogram of Oriented Principal Components of 3D Pointclouds for Action Recognition,' in Computer Vision – ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part II, D. Fleet, T. Pajdla, B. Schiele, and T. Tuytelaars, Eds., ed Cham: Springer International Publishing, 2014, pp. 742-757.
[11] C. Lu, J. Jia, and C. K. Tang, 'Range-Sample Depth Feature for Action Recognition,' in 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014, pp. 772-779.
[12] J. K. Aggarwal and L. Xia, 'Human activity recognition from 3D data: A review,' Pattern Recognition Letters, vol. 48, pp. 70-80, 2014.
[13] L. Xia and J. K. Aggarwal, 'Spatio-temporal Depth Cuboid Similarity Feature for Activity Recognition Using Depth Camera,' in 2013 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013, pp. 2834-2841.
[14] O. Oreifej and Z. Liu, 'HON4D: Histogram of Oriented 4D Normals for Activity Recognition from Depth Sequences,' in 2013 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013, pp. 716-723.
[15] L. Xia, C. C. Chen, and J. K. Aggarwal, 'View invariant human action recognition using histograms of 3D joints,' in 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, 2012, pp. 20-27.
[16] J. Wang, Z. Liu, Y. Wu, and J. Yuan, 'Mining actionlet ensemble for action recognition with depth cameras,' in 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2012, pp. 1290-1297.
[17] J. Wang, Z. Liu, J. Chorowski, Z. Chen, and Y. Wu, 'Robust 3D Action Recognition with Random Occupancy Patterns,' in Computer Vision – ECCV 2012: 12th European Conference on Computer Vision, Florence, Italy, October 7-13, 2012, Proceedings, Part II, A. Fitzgibbon, S. Lazebnik, P. Perona, Y. Sato, and C. Schmid, Eds., ed Berlin, Heidelberg: Springer Berlin Heidelberg, 2012, pp. 872-885.
[18] A. W. Vieira, E. R. Nascimento, G. L. Oliveira, Z. Liu, and M. F. M. Campos, 'STOP: Space-Time Occupancy Patterns for 3D Action Recognition from Depth Map Sequences,' in Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications: 17th Iberoamerican Congress, CIARP 2012, Buenos Aires, Argentina, September 3-6, 2012. Proceedings, L. Alvarez, M. Mejail, L. Gomez, and J. Jacobo, Eds., ed Berlin, Heidelberg: Springer Berlin Heidelberg, 2012, pp. 252-259.
[19] L. Wang, Y. Qiao, and X. Tang, 'Video Action Detection with Relational Dynamic-Poselets,' in Computer Vision – ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part V, D. Fleet, T. Pajdla, B. Schiele, and T. Tuytelaars, Eds., ed Cham: Springer International Publishing, 2014, pp. 565-580.
[20] P. Banerjee and R. Nevatia, 'Pose Filter Based Hidden-CRF Models for Activity Detection,' in Computer Vision – ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part II, D. Fleet, T. Pajdla, B. Schiele, and T. Tuytelaars, Eds., ed Cham: Springer International Publishing, 2014, pp. 711-726.
[21] Y. Tian, R. Sukthankar, and M. Shah, 'Spatiotemporal Deformable Part Models for Action Detection,' in 2013 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013, pp. 2642-2649.
[22] J. Yuan, Z. Liu, and Y. Wu, 'Discriminative Video Pattern Search for Efficient Action Detection,' IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, pp. 1728-1743, 2011.
[23] Q. Luo, X. Kong, G. Zeng, and J. Fan, 'Human action detection via boosted local motion histograms,' Machine Vision and Applications, vol. 21, pp. 377-389, 2010.
[24] L. Cao, Z. Liu, and T. S. Huang, 'Cross-dataset action detection,' in 2010 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010, pp. 1998-2005.
[25] C. Chen, K. Liu, and N. Kehtarnavaz, 'Real-time human action recognition based on depth motion maps,' Journal of Real-Time Image Processing, vol. 12, pp. 155-163, 2016.
[26] G. Gkioxari and J. Malik, 'Finding action tubes,' in 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015, pp. 759-768.
[27] M. Rauter, 'Reliable Human Detection and Tracking in Top-View Depth Images,' in 2013 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2013, pp. 529-534.
[28] G. Hu, D. Reilly, B. Swinden, and Q. Gao, 'Human Activity Analysis in a 3D Bird’s-eye View,' in Image Analysis and Recognition: 11th International Conference, ICIAR 2014, Vilamoura, Portugal, October 22-24, 2014, Proceedings, Part II, A. Campilho and M. Kamel, Eds., ed Cham: Springer International Publishing, 2014, pp. 365-373.
[29] G. Hu, D. Reilly, M. Alnusayri, B. Swinden, and Q. Gao, 'DT-DT: Top-down Human Activity Analysis for Interactive Surface Applications,' presented at the Proceedings of the Ninth ACM International Conference on Interactive Tabletops and Surfaces, Dresden, Germany, 2014.
[30] O. Barnich and M. V. Droogenbroeck, 'ViBe: A Universal Background Subtraction Algorithm for Video Sequences,' IEEE Transactions on Image Processing, vol. 20, pp. 1709-1724, 2011.
[31] O. Barnich and M. V. Droogenbroeck, 'ViBE: A powerful random technique to estimate the background in video sequences,' in 2009 IEEE International Conference on Acoustics, Speech and Signal Processing, 2009, pp. 945-948.
[32] B. E. Boser, I. M. Guyon, and V. N. Vapnik, 'A training algorithm for optimal margin classifiers,' presented at the Proceedings of the fifth annual workshop on Computational learning theory, Pittsburgh, Pennsylvania, USA, 1992.
[33] C. Cortes and V. Vapnik, 'Support-Vector Networks,' Machine Learning, vol. 20, pp. 273-297, 1995.
[34] C.-N. J. Yu and T. Joachims, 'Learning structural SVMs with latent variables,' presented at the Proceedings of the 26th Annual International Conference on Machine Learning, Montreal, Quebec, Canada, 2009.
[35] P. F. Felzenszwalb, R. B. Girshick, D. McAllester, and D. Ramanan, 'Object Detection with Discriminatively Trained Part-Based Models,' IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 32, pp. 1627-1645, 2010.
[36] N. Dalal and B. Triggs, 'Histograms of oriented gradients for human detection,' in 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2005, pp. 886-893 vol. 1.
[37] L. Chen, H. Lin, and S. Li, 'Depth image enhancement for Kinect using region growing and bilateral filter,' in 2012 21st International Conference on Pattern Recognition (ICPR), 2012, pp. 3070-3073.
[38] Y. L. Huang, T. W. Hsu, and S. Y. Chien, 'Edge-aware depth completion for pointcloud 3D scene visualization on an RGB-D camera,' in 2014 IEEE Conference on Visual Communications and Image Processing, 2014, pp. 422-425.
[39] J. Liu, X. Gong, and J. Liu, 'Guided inpainting and filtering for Kinect depth maps,' in 2012 21st International Conference on Pattern Recognition (ICPR), 2012, pp. 2055-2058.
[40] T. Q. Pham, 'Non-maximum Suppression Using Fewer than Two Comparisons per Pixel,' in Advanced Concepts for Intelligent Vision Systems: 12th International Conference, ACIVS 2010, Sydney, Australia, December 13-16, 2010, Proceedings, Part I, J. Blanc-Talon, D. Bone, W. Philips, D. Popescu, and P. Scheunders, Eds., ed Berlin, Heidelberg: Springer Berlin Heidelberg, 2010, pp. 438-451.
[41] D. Comaniciu and P. Meer, 'Mean shift: a robust approach toward feature space analysis,' IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, pp. 603-619, 2002.
[42] C.-C. Chang and C.-J. Lin, 'LIBSVM: A library for support vector machines,' ACM Trans. Intell. Syst. Technol., vol. 2, pp. 1-27, 2011.
[43] N. Nguyen and Y. Guo, 'Comparisons of sequence labeling algorithms and extensions,' presented at the Proceedings of the 24th international conference on Machine learning, Corvalis, Oregon, USA, 2007.
[44] A. L. Yuille and A. Rangarajan, 'The concave-convex procedure,' Neural Comput., vol. 15, pp. 915-936, 2003.
[45] K. Zhang, L. Zhang, and M.-H. Yang, 'Real-time compressive tracking,' presented at the Proceedings of the 12th European conference on Computer Vision - Volume Part III, Florence, Italy, 2012.
[46] T. W. Hsu, Y. H. Yang, T. H. Yeh, A. S. Liu, L. C. Fu, and Y. C. Zeng, 'Privacy Free Indoor Action Detection System Using Top-View Depth Camera based on Key-poses,' in 2016 IEEE International Conference on Systems, Man, and Cybernetics (SMC), 2016, pp. 2968-2973.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/49327
dc.description.abstract: (translated from Chinese) In this thesis, we propose a novel indoor action detection system that automatically logs users' daily activities. We implement the system with a top-view depth camera, which raises fewer privacy concerns and interferes less with users' daily lives. Moreover, compared with traditional side-view or surveillance-view RGB cameras, our system is less affected by occlusion and illumination changes.
The goal of action detection is to identify where and when actions of interest occur in a video stream. In this thesis, we decompose an action into a sequence of key-poses arranged in a specific order. We model an action with a latent support vector machine, in which we jointly learn the appearance of the key-poses and their temporal ordering. We describe postures with two complementary features: the histogram of depth differences, which encodes the appearance of a posture, and the location-signified feature, which describes the spatial relations between the person, the floor, and non-floor objects such as beds or chairs.
We evaluate our action detection system with recall-precision curves and average precision. The experimental results demonstrate its accuracy and robustness.
(zh_TW)
dc.description.abstract: In this thesis, we propose a novel indoor daily activity detection system that can automatically keep a log of users' daily lives. The hardware setup adopts top-view depth cameras, which makes our system less privacy-sensitive and less intrusive to users. Moreover, in contrast with traditional setups using side-view or surveillance-view RGB cameras, our camera setting avoids the problems of occlusion and illumination change.
The goal of action detection is to identify where and when the actions of interest happen in a video stream. In this work, we regard the image sequence of an action as a set of key-poses of the user of interest arranged in a certain temporal order. To model an action, we use the latent SVM framework to jointly learn the appearance of the key-poses and their temporal locations. We use two complementary kinds of features to describe human postures. The first is the histogram of depth difference values, which encodes the shape of human poses. The second is the location-signified feature, which captures the spatial relations among the person of interest, the floor, and other non-floor objects, e.g., beds or chairs.
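The key-pose model described above — an action as K key-poses that must appear in a fixed temporal order, scored jointly over appearance and placement — admits a compact inference sketch. This is a simplification under stated assumptions (linear per-key-pose templates scored over precomputed frame features; all names are hypothetical), not the thesis's full latent SVM formulation:

```python
import numpy as np

def action_score(frames, keypose_weights):
    """Best score of an action model over all temporal placements.

    frames: (T, D) array, one feature vector per frame.
    keypose_weights: (K, D) array, one linear template per key-pose.
    The latent variable is the assignment of the K key-poses to frames
    t_1 < t_2 < ... < t_K; dynamic programming finds the best one.
    """
    T = frames.shape[0]
    K = keypose_weights.shape[0]
    if T < K:
        return -np.inf  # not enough frames to place every key-pose
    resp = keypose_weights @ frames.T    # (K, T) per-frame responses
    dp = np.full((K, T), -np.inf)        # dp[k, t]: key-pose k placed at frame t
    dp[0] = resp[0]
    for k in range(1, K):
        # best score of the previous key-pose over all frames strictly before t
        best_prev = np.maximum.accumulate(dp[k - 1])[:-1]
        dp[k, 1:] = best_prev + resp[k, 1:]
    return dp[-1].max()
```

Detection would then slide such a scorer over the video stream and threshold the score; learning the templates themselves is where the latent SVM training of Chapter 5 comes in.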
We use recall-precision curves and average precision (AP) to validate the proposed daily activity detection system, and the experimental results demonstrate its accuracy and robustness.
en
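The recall-precision and average-precision evaluation mentioned in the abstract can be sketched as follows. This is a minimal sketch using the standard non-interpolated AP definition; the thesis may use a different AP variant, and the function name is hypothetical:

```python
import numpy as np

def average_precision(scores, labels):
    """Non-interpolated AP from detection scores and 0/1 ground truth.

    Detections are ranked by descending score; precision is recorded at
    the rank of each true positive and averaged over all positives.
    """
    order = np.argsort(scores)[::-1]
    labels = np.asarray(labels)[order]
    tp = np.cumsum(labels)                 # true positives up to each rank
    ranks = np.arange(1, len(labels) + 1)  # detections considered so far
    precision = tp / ranks
    return float(precision[labels == 1].mean())
```

Sweeping the rank cut-off also yields the recall-precision curve itself, with recall at each rank given by tp divided by the total number of positives.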
dc.description.provenance: Made available in DSpace on 2021-06-15T11:23:50Z (GMT). No. of bitstreams: 1
ntu-105-R03921002-1.pdf: 3745240 bytes, checksum: f82f6d10f4f66235b08c6b150591d3dc (MD5)
Previous issue date: 2016
en
dc.description.tableofcontents: Acknowledgements (誌謝) I
Abstract in Chinese (摘要) II
ABSTRACT III
TABLE OF CONTENTS IV
LIST OF FIGURES VII
LIST OF TABLES X
Chapter 1 Introduction 1
1.1 Motivation 1
1.2 Literature Review 2
1.2.1 Action Understanding 2
1.2.2 Action Recognition based on Depth Images 3
1.2.3 Action Detection 4
1.2.4 Applications based on Top-View Depth Camera 4
1.3 Contribution 6
1.4 Thesis Organization 7
Chapter 2 Preliminaries 8
2.1 ViBe Background Subtraction Algorithm 8
2.1.1 Introduction to ViBe Background Subtraction Algorithm 8
2.1.2 Pixel Model and Background Classification 9
2.1.3 Model Initialization 10
2.1.4 Model Updating 10
2.2 Support Vector Machine 12
2.2.1 Introduction to Support Vector Machine 12
2.2.2 Soft Margin SVM 14
2.2.3 Kernel Trick of SVM 15
2.3 Latent Support Vector Machine 15
2.3.1 Introduction to Latent SVM 16
2.3.2 Deformable Part Model 16
Chapter 3 Camera Setting and Image Preprocessing 20
3.1 Camera Setting 20
3.2 Preprocessing 21
3.2.1 Depth Image Inpainting 21
3.2.2 Background Subtraction 22
3.2.3 World Coordinate Mapping 26
Chapter 4 Human Detection and Tracking 30
4.1 Human Detection 30
4.1.1 Head Candidates Finding 31
4.1.2 Histogram of Depth Difference Feature and Head-Shoulder Classifier 32
4.2 Human Tracking 34
Chapter 5 Latent SVM for Action Model 37
5.1 The Action and Key-poses 37
5.2 Action Model Description 39
5.3 Feature extraction 40
5.4 Model Training 42
5.5 Online Action Detection 44
5.6 Action Suppression Mechanism 44
Chapter 6 Experiment 46
6.1 Description of Experimental Environment and Dataset 46
6.2 Human Tracking Results 49
6.3 Action Recognition Results 51
6.4 Action Detection Results 54
6.4.1 Action Suppression 54
6.4.2 Action Detection Result on Single Person Dataset 55
6.4.3 Action Detection Result on Multiple People Dataset 59
Chapter 7 Conclusion and Future Work 61
REFERENCE 63
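Section 2.1 of the table of contents covers the ViBe background subtraction algorithm [30], whose core pixel classification rule (Section 2.1.2) is simple enough to sketch: a pixel is background if enough of its stored samples lie close to the current value. The parameter values below are the defaults reported in [30]; the function name is hypothetical, and the full algorithm also includes the neighborhood-based initialization and stochastic model updating of Sections 2.1.3-2.1.4:

```python
import numpy as np

# Default parameters reported for ViBe [30]: N samples kept per pixel,
# matching radius R (intensity units), minimum number of close samples.
N, R, MIN_MATCHES = 20, 20, 2

def is_background(pixel_value, samples):
    """ViBe classification rule: a pixel is background if at least
    MIN_MATCHES of its stored samples lie within radius R of it."""
    close = np.abs(samples.astype(int) - int(pixel_value)) < R
    return int(close.sum()) >= MIN_MATCHES
```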
dc.language.iso: en
dc.subject: 隱藏變量支持向量機 (latent SVM) (zh_TW)
dc.subject: 動作偵測 (action detection) (zh_TW)
dc.subject: 重要姿勢 (key-pose) (zh_TW)
dc.subject: latent SVM (en)
dc.subject: key-pose (en)
dc.subject: activity detection (en)
dc.title: 使用俯視深度攝影機之智慧家庭日常動作偵測系統 (zh_TW)
dc.title: Daily Activity Detection System Using Top-View Depth Camera for Smart Home Environment (en)
dc.type: Thesis
dc.date.schoolyear: 104-2
dc.description.degree: Master (碩士)
dc.contributor.oralexamcommittee: 陳祝嵩, 范欽雄, 莊永裕, 黃正民
dc.subject.keyword: 動作偵測 (action detection), 重要姿勢 (key-pose), 隱藏變量支持向量機 (latent SVM) (zh_TW)
dc.subject.keyword: activity detection, key-pose, latent SVM (en)
dc.relation.page: 66
dc.identifier.doi: 10.6342/NTU201602859
dc.rights.note: 有償授權 (authorized with compensation)
dc.date.accepted: 2016-08-18
dc.contributor.author-college: College of Electrical Engineering and Computer Science (電機資訊學院) (zh_TW)
dc.contributor.author-dept: Graduate Institute of Electrical Engineering (電機工程學研究所) (zh_TW)
Appears in Collections: Department of Electrical Engineering (電機工程學系)

Files in this item:
File | Size | Format
ntu-105-1.pdf (not authorized for public access) | 3.66 MB | Adobe PDF


Items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.
