Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/93177
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 吳育任 | zh_TW |
dc.contributor.advisor | Yuh-Renn Wu | en |
dc.contributor.author | 黃桂廷 | zh_TW |
dc.contributor.author | Kuei-Ting Huang | en |
dc.date.accessioned | 2024-07-23T16:08:40Z | - |
dc.date.available | 2024-07-24 | - |
dc.date.copyright | 2024-07-23 | - |
dc.date.issued | 2024 | - |
dc.date.submitted | 2024-07-01 | - |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/93177 | - |
dc.description.abstract | 在科技日新月異的時代,運動員也能透過科技輔助使表現更上一層樓,本論文提出一套完整的無標記動作捕捉系統。分析動作能協助運動員追蹤自行表現的變化,無標記能避免影響運動員的表現,並能更廣泛的運用於各個場域。
此系統透過AlphaPose進行二維影像辨識,並以多視角立體電腦視覺進行三維重建。使用棋盤板以及立方體對相機模型預先校正,以獲得相機矩陣參數,最後利用生物力學軟體OpenSim運算人體動作的關節角度。
配合真實球場架設的情況,相機偶爾會有些微移動,畫面偏移會造成嚴重的三維誤差。此時可以透過最小平方法修正外部參數矩陣,不需前往球場重新校正。分析實際球場之校正誤差,在一公尺約佔180像素之影像大小中,投手活動範圍內的校正最大誤差約為10毫米;而兩攝影機相對而視時,誤差可達攝影機夾角90度時的約10倍。
另外,夜間比賽的辨識效果不佳,部分離群值會影響整體動作分析,本論文因此提出多個時序性的方法去除離群值:時間序列局部最大值法透過影片的連續性改善AlphaPose的熱圖取點,關節配對法可交換左右辨識相反的關節,自動異常偵測法自動尋找異常區間並以一階擬合修正辨識錯誤。最佳的演算法組合可使關節位置平均誤差達到30.31毫米。透過OpenSim能使動作相當穩定,分析手肘與膝蓋角度分別有8.13度、5.15度的誤差。 | zh_TW |
dc.description.abstract | In this era of rapid technological advancement, athletes can raise their performance with the assistance of technology. This thesis presents a complete markerless motion capture system: motion analysis lets athletes track changes in their own performance, and the absence of markers avoids interfering with that performance while allowing the system to be deployed across a wide range of venues.
The system uses AlphaPose for 2-D pose recognition and multi-view stereo computer vision for 3-D reconstruction. The camera models are pre-calibrated with a checkerboard pattern and a cube to obtain the camera matrix parameters, and joint angles of human movement are then computed with the biomechanics software OpenSim. In real-world installations the cameras occasionally shift slightly, and the resulting image offsets cause severe 3-D errors; the extrinsic parameter matrices can be corrected with a least-squares method without returning to the field for recalibration. Analysis of the calibration error at an actual stadium, at an image scale of roughly 180 pixels per meter, shows a maximum error of about 10 mm within the pitcher's range of motion, while cameras facing each other can produce errors roughly 10 times larger than cameras separated by 90 degrees. In addition, recognition quality degrades during night games, and the resulting outliers corrupt the overall motion analysis. This thesis therefore proposes several temporal methods for outlier removal: Local Maximum with Time Sequence exploits video continuity to improve keypoint selection from the AlphaPose heatmaps, Matching Joints swaps joints whose left and right labels are reversed, and Auto-detect Abnormal Ranges locates anomalous intervals and corrects them with a first-order fit (see the illustrative Python sketches that follow the metadata table below). The best combination of these algorithms achieves an average joint-position error of 30.31 mm. OpenSim further stabilizes the motion, yielding elbow and knee angle errors of 8.13 degrees and 5.15 degrees, respectively. | en |
dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2024-07-23T16:08:40Z No. of bitstreams: 0 | en |
dc.description.provenance | Made available in DSpace on 2024-07-23T16:08:40Z (GMT). No. of bitstreams: 0 | en |
dc.description.tableofcontents | Abstract v
Contents vii
List of Figures x
List of Tables xiii
Chapter 1 Introduction 1
1.1 Motivation 1
1.2 2-D Human Pose Estimation 2
1.2.1 Top-Down Approach 3
1.2.2 Bottom-Up Approach 3
1.3 3-D Human Pose Estimation 4
1.3.1 3-D Datasets 4
1.3.2 Single Camera 3-D HPE 4
1.3.3 Multi-Camera 3-D HPE 5
1.4 Problem of Application at Baseball Fields 6
1.5 Contribution 7
1.6 Thesis Overview 8
Chapter 2 Methodology 10
2.1 AlphaPose 10
2.1.1 Heatmap 11
2.1.2 Keypoint Sets 13
2.2 Calibration 13
2.2.1 Intrinsic Parameter Matrix 16
2.2.2 Lens Distortions 17
2.2.3 Extrinsic Parameter Matrix 18
2.3 3-D Projection 19
2.4 OpenSim 20
2.4.1 OpenSense Model 21
2.4.2 Scaling Skeletal Model 21
2.4.3 Angle Calculation: Inverse Kinematic 24
Chapter 3 Optimization of Pose Estimation and Motion Analysis 25
3.1 Local Maximum with Time Sequence 25
3.2 2-D Pre-processing 29
3.2.1 Matching Joints 29
3.2.2 Auto-detect Abnormal Ranges 33
3.3 Recalibration for Moved Cameras 37
3.3.1 Optimization of Extrinsic Parameter Matrix 38
3.3.2 Correction of The Coordinate System 38
3.3.3 Reprojection Error 41
3.4 3-D Reconstruction 41
3.5 OpenSim Setting 45
3.5.1 OpenSense Model Revise 45
3.5.2 Marker Setup 49
3.5.3 Scaling Model Setup 49
3.5.4 Inverse Kinematics Setup 53
3.6 Phase in Pitching 54
3.7 3-D Skeleton 58
Chapter 4 Result and Discuss 60
4.1 Device 60
4.2 Dataset 62
4.3 Influence of Error on Reconstruction 64
4.4 Calibration Error 69
4.5 Pose Error Analysis 71
4.5.1 Result of LMT Algorithm 74
4.5.2 Result of MJ Algorithm 75
4.5.3 Result of AAR Algorithm 78
4.6 Angle Error Analysis 79
Chapter 5 Conclusion and Future Work 84
5.1 Conclusion 84
5.2 Future Work 85
References 87 | - |
dc.language.iso | en | - |
dc.title | 棒球場景中的時序二維姿態辨識優化與應用於無監督三維人體動作辨識 | zh_TW |
dc.title | Optimized Temporal 2-D Pose Recognition in Baseball Scenes Applied to Unsupervised 3-D Human Motion Detection | en |
dc.type | Thesis | - |
dc.date.schoolyear | 112-2 | - |
dc.description.degree | Master | - |
dc.contributor.oralexamcommittee | 黃致豪;吳沛遠 | zh_TW |
dc.contributor.oralexamcommittee | Jyh-How Huang;Pei-Yuan Wu | en |
dc.subject.keyword | 人體姿態估計,立體視覺,離群值去除,人體模型,逆運動學,投球動作, | zh_TW |
dc.subject.keyword | human pose estimation,stereo vision,outlier removal,human model,inverse kinematics,pitching posture, | en |
dc.relation.page | 94 | - |
dc.identifier.doi | 10.6342/NTU202401355 | - |
dc.rights.note | Authorized for release (open access worldwide) | - |
dc.date.accepted | 2024-07-02 | - |
dc.contributor.author-college | College of Electrical Engineering and Computer Science | - |
dc.contributor.author-dept | Graduate Institute of Photonics and Optoelectronics | - |
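The abstract above notes that, when a camera shifts slightly after installation, its extrinsic parameter matrix can be corrected by a least-squares method without recalibrating at the field. Below is a minimal sketch of one way to do this, assuming a handful of fixed 3-D reference points on the field and their re-detected pixel locations are available; it uses OpenCV's `solvePnP` (an iterative reprojection-error minimizer) as the solver, which may differ from the thesis's own least-squares formulation, and the function name is illustrative.

```python
import cv2
import numpy as np

def correct_extrinsics(world_points, image_points, K, dist_coeffs):
    """Re-estimate the extrinsic matrix [R | t] of a camera that has shifted.

    world_points: (N, 3) fixed reference points on the field, in world coordinates
    image_points: (N, 2) re-detected pixel locations of those points in the shifted view
    K:            (3, 3) intrinsic matrix from the original checkerboard calibration
    dist_coeffs:  lens distortion coefficients from the original calibration
    """
    ok, rvec, tvec = cv2.solvePnP(
        np.asarray(world_points, dtype=np.float64),
        np.asarray(image_points, dtype=np.float64),
        K,
        dist_coeffs,
    )
    if not ok:
        raise RuntimeError("pose estimation failed")
    R, _ = cv2.Rodrigues(rvec)   # rotation vector -> 3x3 rotation matrix
    return np.hstack([R, tvec])  # corrected 3x4 extrinsic matrix [R | t]
```

Only the intrinsic matrix and distortion model from the original calibration are reused, so a correction of this kind can be computed from a single frame captured after the camera moved.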
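For the 3-D reconstruction from multiple calibrated views, a common formulation is linear (DLT) triangulation of each joint from its 2-D detections. The sketch below assumes one 3x4 projection matrix P = K[R | t] per camera; it is a generic textbook version, not necessarily the thesis's exact implementation.

```python
import numpy as np

def triangulate_joint(projection_matrices, pixel_points):
    """Triangulate one joint from N >= 2 calibrated views with the linear DLT method.

    projection_matrices: list of 3x4 matrices P = K [R | t], one per camera
    pixel_points:        list of (u, v) detections of the same joint in each view
    Returns the joint position in world coordinates.
    """
    rows = []
    for P, (u, v) in zip(projection_matrices, pixel_points):
        # Each view contributes two linear constraints on the homogeneous point X.
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    A = np.stack(rows)
    # The homogeneous solution is the right singular vector of A
    # associated with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]
```

Applied per joint and per frame, a routine like this yields 3-D keypoint trajectories of the kind that OpenSim's scaling and inverse-kinematics steps consume.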
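The Auto-detect Abnormal Ranges step is described as locating an anomalous interval in a keypoint trajectory and replacing it with a first-order fit. A minimal sketch of the correction part follows, assuming the abnormal frame range has already been detected; the function name and the size of the context window are illustrative choices.

```python
import numpy as np

def correct_abnormal_range(trajectory, start, end, context=5):
    """Overwrite frames [start, end) of a 1-D keypoint trajectory with a linear fit.

    trajectory: array of shape (T,), one coordinate of one joint over time
    start, end: frame indices bounding the detected abnormal range
    context:    number of presumably valid frames on each side used for the fit
    """
    traj = np.asarray(trajectory, dtype=float).copy()
    left = np.arange(max(0, start - context), start)
    right = np.arange(end, min(len(traj), end + context))
    support = np.concatenate([left, right])
    # First-order (straight line) fit through the surrounding valid frames.
    slope, intercept = np.polyfit(support, traj[support], deg=1)
    bad = np.arange(start, end)
    traj[bad] = slope * bad + intercept
    return traj
```

In a pipeline like this, the routine would be applied independently to each coordinate of each flagged joint before triangulation, so a short stretch of misdetections does not propagate into the 3-D reconstruction.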
Appears in Collections: Graduate Institute of Photonics and Optoelectronics
Files in This Item:
File | Size | Format |
---|---|---|
ntu-112-2.pdf | 25.31 MB | Adobe PDF |