NTU Theses and Dissertations Repository
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/96136

Full metadata record (DC field: value [language])
dc.contributor.advisor: 丁建均 [zh_TW]
dc.contributor.advisor: Jian-Jiun Ding [en]
dc.contributor.author: 李秉澤 [zh_TW]
dc.contributor.author: Bing-Ze Li [en]
dc.date.accessioned: 2024-11-15T16:06:53Z
dc.date.available: 2024-11-16
dc.date.copyright: 2024-11-15
dc.date.issued: 2024
dc.date.submitted: 2024-10-29
dc.identifier.citation: References
[1] Barris, S., & Button, C. (2008). A review of vision-based motion analysis in sport. Sports Medicine, 38(12), 1025-1043.
[2] Ortega, B. P., & Jiménez Olmedo, J. M. (2017). Application of motion capture technology for sport performance analysis. Retos: Nuevas Tendencias en Educación Física, Deporte y Recreación, 32, 241-247.
[3] Sarupuri, B., Kulpa, R., Aristidou, A., & Multon, F. (2024). Dancing in virtual reality as an inclusive platform for social and physical fitness activities: A survey. The Visual Computer, 40(6), 4055-4070.
[4] Logan, B. (2000, October). Mel frequency cepstral coefficients for music modeling. In ISMIR (Vol. 270, No. 1, p. 11).
[5] Kulkarni, S., Deshmukh, S., Fernandes, F., Patil, A., & Jabade, V. (2023). Poseanalyser: A survey on human pose estimation. SN Computer Science, 4(2), 136.
[6] Li, H., He, X., Barnes, N., & Wang, M. (2016). Learning Hough transform with latent structures for joint object detection and pose estimation. In MultiMedia Modeling: 22nd International Conference, MMM 2016 (pp. 116-129). Springer.
[7] Illingworth, J., & Kittler, J. (1988). A survey of the Hough transform. Computer Vision, Graphics, and Image Processing, 44(1), 87-116.
[8] Sun, X., Xiao, B., Wei, F., Liang, S., & Wei, Y. (2018). Integral human pose regression. In Proceedings of the European Conference on Computer Vision (ECCV) (pp. 529-545).
[9] Lea, C., Flynn, M. D., Vidal, R., Reiter, A., & Hager, G. D. (2017). Temporal convolutional networks for action segmentation and detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 156-165).
[10] Bragagnolo, L., Terreran, M., Allegro, D., & Ghidoni, S. (2024). Multi-view pose fusion for occlusion-aware 3D human pose estimation. arXiv preprint arXiv:2408.15810.
[11] Cao, Z., Simon, T., Wei, S. E., & Sheikh, Y. (2017). Realtime multi-person 2D pose estimation using part affinity fields. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 7291-7299).
[12] Li, W., Wang, Z., Yin, B., Peng, Q., Du, Y., Xiao, T., Yu, G., Lu, H., Xu, C., & Sun, J. (2019). Rethinking on multi-stage networks for human pose estimation. arXiv preprint arXiv:1901.00148.
[13] Zheng, C., Wu, W., Chen, C., Yang, T., Zhu, S., Shen, J., Tao, D., Savvides, M., & Shah, M. (2023). Deep learning-based human pose estimation: A survey. ACM Computing Surveys, 56(1), 1-37.
[14] Dai, Z., Kong, F., Zhou, Z., Cao, Q., Liu, X., & Hu, J. (2021). Improving pose estimation performance via online hard keypoints mining. Applied Sciences, 11(14), 6589.
[15] Noori, F. M., Wallace, B., Uddin, M. Z., & Torresen, J. (2019, May). A robust human activity recognition approach using OpenPose, motion features, and deep recurrent neural network. In Scandinavian Conference on Image Analysis (pp. 299-310). Springer.
[16] Osokin, D. (2018). Real-time 2D multi-person pose estimation on CPU: Lightweight OpenPose. arXiv preprint arXiv:1811.12004.
[17] Zhang, M., Zhou, Y., Xu, X., Ren, Z., Zhang, Y., Liu, S., & Luo, W. (2023). Multi-view emotional expressions dataset using 2D pose estimation. Scientific Data, 10(1), 649.
[18] Van Dyk, D. A., & Meng, X. L. (2001). The art of data augmentation. Journal of Computational and Graphical Statistics, 10(1), 1-50.
[19] Danielsson, P. E. (1980). Euclidean distance mapping. Computer Graphics and Image Processing, 14(3), 227-248.
[20] Müller, M. (2007). Dynamic time warping. Information Retrieval for Music and Motion (pp. 69-84). Springer.
[21] Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735-1780.
[22] Zhao, R., Wang, D., Yan, R., Mao, K., Shen, F., & Wang, J. (2017). Machine health monitoring using local feature-based gated recurrent unit networks. IEEE Transactions on Industrial Electronics, 65(2), 1539-1548.
[23] Tang, W., Long, G., Liu, L., Zhou, T., Jiang, J., & Blumenstein, M. (2020). Rethinking 1D-CNN for time series classification: A stronger baseline. arXiv preprint arXiv:2002.10061.
[24] Zhang, Y., Zhang, F., Fang, L., & Chen, N. (2023). Inferring socioeconomic environment from built environment characteristics based on street view images: An approach of Seq2Seq method. International Journal of Applied Earth Observation and Geoinformation, 123, 103458.
[25] Rodriguez, J. D., Perez, A., & Lozano, J. A. (2009). Sensitivity analysis of k-fold cross validation in prediction error estimation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(3), 569-575.
[26] Bertolini, R., Finch, S. J., & Nehm, R. H. (2021). Enhancing data pipelines for forecasting student performance: Integrating feature selection with cross-validation. International Journal of Educational Technology in Higher Education, 18, 1-23.
[27] Barber, C. B., Dobkin, D. P., & Huhdanpaa, H. (1996). The quickhull algorithm for convex hulls. ACM Transactions on Mathematical Software (TOMS), 22(4), 469-483.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/96136
dc.description.abstract [zh_TW]:
在當前數位化的時代,自主學習的能力已成為個體提升知識和技能的重要機制,特別是在體育教育領域,自我導向的舞蹈學習尤為受到重視。然而,由於缺乏專業的評估,學習者往往難以掌握舞蹈動作的細微之處,也難以有效地評估自己與標準表現之間的差異。為了解決這一問題,本研究提出了兩種評估方法:第一種是多模型融合的自動化舞蹈評分系統,用於評估舞者的表現並提供針對性的改進建議;第二種則是通過比較關節角度和關節覆蓋面積與參考影片進行評分。
在第一種方法中,系統主要利用姿態識別技術來捕捉舞者的關節運動數據,並基於收集到的關節數據,開發出一系列評估指標,包括歐幾里得距離、動態時間規整(DTW)距離以及各種統計特徵差異,用來量化舞蹈動作的表現。為了增強數據集的多樣性,還使用了時間偏移、數據混合和平滑加噪音等數據增強技術。這些增強策略不僅能模擬多種場景下的舞蹈動作,還能增強模型在不同背景下的泛化能力。為了進一步提高評估的精確性,本研究結合了三種深度學習模型:長短期記憶網絡(LSTM)、門控循環單元(GRUs)和一維卷積神經網絡(1D CNNs)。這些模型的預測結果通過加權平均法進行融合,並根據模型在訓練數據集上的表現動態調整權重,確保融合後的預測能夠充分發揮每個模型的優勢。
在第二種方法中,本研究提出了一種基於人體關節點分析的替代自動舞蹈動作評估系統。該系統的核心是通過特定公式計算出關節面積分數和角度分數。面積分數是根據關節所形成的多邊形面積計算的,通過分析關節之間的距離和位置變化,評估舞者的動作是否符合既定標準。相比之下,角度分數則根據關節之間的角度變化進行計算,反映出舞者動作的精確性和流暢性。這些分數隨後會經過加權處理,最終生成綜合的動作評估結果。通過這兩種系統,學習者可以獲得詳細的舞蹈動作評估,從而在自主學習過程中進行更精確的改進。
dc.description.abstract [en]:
In the contemporary digital landscape, the ability to engage in autonomous learning has become a crucial mechanism for individuals seeking to enhance their knowledge and skills, particularly in sports education, where self-directed learning of dance performance is highly valued. However, the lack of professional assessment often impedes learners' understanding of the nuances of dance movements and complicates their efforts to evaluate the differences between their performances and established benchmarks. To address this challenge, the present study introduces two evaluation methods: the first is a multi-model fusion automated dance scoring system designed to assess dancers' performances and provide targeted recommendations for improvement, while the second scores a performance by comparing joint angles and joint coverage areas against a reference video.
In the first method, the system uses pose recognition technology to capture data on the dancer's joint movements. From the collected joint data, a set of evaluation metrics is developed, including Euclidean distance, Dynamic Time Warping (DTW) distance, and various statistical feature differences, to quantify the execution of dance movements. The dataset is enhanced through data augmentation techniques such as time shifting, mixup-style data mixing, and smoothed noise injection. These augmentation strategies not only replicate dance movements across diverse scenarios but also strengthen the models' ability to generalize across different contexts. To improve the accuracy of the evaluations, the study incorporates three deep learning models: Long Short-Term Memory networks (LSTM), Gated Recurrent Units (GRUs), and One-Dimensional Convolutional Neural Networks (1D CNNs). The predictions from these models are combined using a weighted average, with each model's weight adjusted dynamically according to its performance on the training dataset, ensuring that the fused prediction exploits the strengths of each individual model.
In the second method, the study presents an alternative automated dance movement evaluation system based on the analysis of human joint points. This system computes joint area scores and angle scores using specific formulas. The area score is derived from the polygonal area defined by the joints; by analyzing the variations in distance and position among the joints, it assesses how well the dancer's movements align with established standards. The angle score, in contrast, is calculated from the changes in the angles between joints, reflecting the accuracy and fluidity of the dancer's movements. These scores are then weighted and combined into a comprehensive evaluation of the movement. With these two systems, learners receive detailed assessments of their dance movements, enabling more precise improvement during self-directed learning.
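As a concrete illustration of the DTW distance feature named in the abstract, here is a minimal sketch in Python. The function name, the (frames × joints) array shapes, and the per-frame Euclidean cost are assumptions chosen for illustration; the thesis's actual implementation is not reproduced here.

```python
# Minimal DTW sketch. Assumption: each sequence is a (T, J) array of
# per-frame joint angles; the per-frame cost is the Euclidean distance
# between joint-angle vectors. Illustrative only, not the thesis code.
import numpy as np

def dtw_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Dynamic Time Warping distance between two pose sequences."""
    Ta, Tb = len(a), len(b)
    D = np.full((Ta + 1, Tb + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, Ta + 1):
        for j in range(1, Tb + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible alignments.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return float(D[Ta, Tb])

# Example: a learner's 110-frame take against a 120-frame reference.
rng = np.random.default_rng(0)
reference = rng.standard_normal((120, 8))   # 120 frames, 8 joint angles
attempt = rng.standard_normal((110, 8))
print(dtw_distance(reference, attempt))
```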
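The weighted-average fusion of the LSTM, GRU, and 1D CNN outputs could be realized as in the sketch below. The inverse-error weighting rule is an assumption for illustration (the abstract only states that weights are adjusted dynamically from training performance), and `fuse_predictions` and its arguments are hypothetical names.

```python
# Sketch of performance-weighted model fusion. Assumption: each model's
# weight is its inverse training error, renormalized to sum to 1.
import numpy as np

def fuse_predictions(preds: list[np.ndarray], errors: list[float]) -> np.ndarray:
    """Weighted average of per-model score predictions.

    preds: one (N,) prediction array per model (e.g. LSTM, GRU, 1D CNN).
    errors: each model's error on training data; lower error -> larger weight.
    """
    w = 1.0 / np.asarray(errors)
    w /= w.sum()                              # normalize to a convex combination
    return np.average(np.stack(preds), axis=0, weights=w)

# Example with three hypothetical models scoring four dance clips.
lstm = np.array([78.0, 85.0, 90.0, 66.0])
gru = np.array([80.0, 83.0, 88.0, 70.0])
cnn = np.array([75.0, 86.0, 91.0, 64.0])
print(fuse_predictions([lstm, gru, cnn], errors=[4.0, 5.0, 6.0]))
```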
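The second method's area and angle scores have a direct geometric reading: a polygon area over ordered 2D joint coordinates, and joint angles from the arccos of normalized segment vectors. The sketch below uses the shoelace formula for the area; the 0-100 mapping, the equal weights, and the joint triplets are illustrative assumptions rather than the thesis's exact formulas.

```python
# Sketch of the area and angle scores for one frame. Assumptions: joints
# are (K, 2) arrays of 2D coordinates in a fixed polygon order; scores are
# mapped to 0-100 with equal weights. Illustrative only.
import numpy as np

def polygon_area(pts: np.ndarray) -> float:
    """Shoelace area of a polygon given (K, 2) vertices in order."""
    x, y = pts[:, 0], pts[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

def joint_angle(a: np.ndarray, b: np.ndarray, c: np.ndarray) -> float:
    """Angle at joint b (degrees) between segments b->a and b->c."""
    u, v = a - b, c - b
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-8)
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

def frame_score(test, ref, triplets, w_area=0.5, w_angle=0.5):
    """Weighted 0-100 score comparing one test frame against the reference."""
    area_err = abs(polygon_area(test) - polygon_area(ref)) / (polygon_area(ref) + 1e-8)
    ang_err = np.mean([abs(joint_angle(*test[list(t)]) - joint_angle(*ref[list(t)]))
                       for t in triplets]) / 180.0
    return 100.0 * (w_area * max(0.0, 1.0 - area_err) + w_angle * (1.0 - ang_err))

# Example: five joints; a pure translation preserves area and angles, so score = 100.
ref = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.5, 1.5], [0.0, 1.0]])
test = ref + 0.05
print(frame_score(test, ref, triplets=[(0, 1, 2), (1, 2, 3)]))
```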
dc.description.provenance [en]: Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2024-11-15T16:06:53Z. No. of bitstreams: 0
dc.description.provenance [en]: Made available in DSpace on 2024-11-15T16:06:53Z (GMT). No. of bitstreams: 0
dc.description.tableofcontents:
Oral Examination Committee Certification i
Acknowledgements ii
Chinese Abstract iii
ABSTRACT v
CONTENTS vii
LIST OF FIGURES xi
LIST OF TABLES xii
Chapter 1 Introduction 1
1.1 Background 1
1.2 Main Contribution 4
1.3 Organization 8
Chapter 2 Database 9
2.1 Background of Data Processing 9
2.2 Data and Methods 10
2.2.1 Audio Extraction and Noise Reduction 13
2.2.2 Mel Frequency Cepstral Coefficients (MFCC) 14
Chapter 3 Joint Point Detection 16
3.1 Joint Point Detection 16
3.1.1 Traditional Image Processing Techniques 17
3.1.2 Deep Learning-Based Methods 17
3.1.3 Spatiotemporal Methods 18
3.1.4 Multi-View-Based Methods 19
3.2 OpenPose 20
3.3 Angle Calculation 23
3.3.1 Joint Selection 24
3.3.2 Joint Vector Calculation 25
3.3.3 Body Segment Angle Calculation 26
Chapter 4 Method 1: Joint-Based Dance Evaluation 28
4.1 Linear Interpolation 28
4.2 Data Augmentation 29
4.2.1 Time Shifting 29
4.2.2 Mixup 31
4.2.3 Noise 33
4.3 Feature Extraction 34
4.3.1 Euclidean Distance 34
4.3.2 Dynamic Time Warping 35
4.3.3 Mean Difference 37
4.3.4 Standard Deviation Difference 37
4.3.5 Minimum and Maximum Difference 38
4.3.6 Moving Average 39
4.3.7 Moving Standard Deviation 41
4.3.8 Angular Velocity 42
4.3.9 Acceleration Difference 43
4.4 Model Prediction 44
4.4.1 Model Selection and Bidirectional Design 45
4.4.2 Long Short-Term Memory (LSTM) 46
4.4.3 Gated Recurrent Units (GRUs) 47
4.4.4 One-Dimensional Convolutional Neural Networks (1D CNNs) 48
4.4.5 Attention Layer 49
4.4.6 K-fold Cross-Validation 50
4.4.7 Weighted Ensemble 51
4.5 Results 53
Chapter 5 Method 2: Pose Recognition Dance Evaluation 60
5.1 Introduction 60
5.2 Music Alignment 61
5.3 Total Area and Angle Scores 62
5.3.1 Total Area Score Calculation 64
5.3.2 Detailed Calculation Process 66
5.3.3 Total Angle Score Calculation 68
5.3.4 Detailed Calculation Process 69
5.4 Results 72
Chapter 6 Conclusion 78
References 80
dc.language.iso: en
dc.subject: 多模型融合 [zh_TW]
dc.subject: 深度學習 [zh_TW]
dc.subject: 數據增強 [zh_TW]
dc.subject: 姿態識別 [zh_TW]
dc.subject: 舞蹈評分系統 [zh_TW]
dc.subject: Deep Learning [en]
dc.subject: Dance Scoring System [en]
dc.subject: Pose Recognition [en]
dc.subject: Data Augmentation [en]
dc.subject: Multi-Model Fusion [en]
dc.title: 舞蹈視界:自動化舞姿分析系統 [zh_TW]
dc.title: Dance Vision: Automated Dance Posture Analysis System [en]
dc.type: Thesis
dc.date.schoolyear: 113-1
dc.description.degree: 碩士 (Master's)
dc.contributor.oralexamcommittee: 劉俊麟;張榮吉;曾易聰 [zh_TW]
dc.contributor.oralexamcommittee: Chun-Lin Liu;Rong-Chi Chang;Yi-Chong Zeng [en]
dc.subject.keyword: 舞蹈評分系統, 姿態識別, 數據增強, 多模型融合, 深度學習 [zh_TW]
dc.subject.keyword: Dance Scoring System, Pose Recognition, Data Augmentation, Multi-Model Fusion, Deep Learning [en]
dc.relation.page: 82
dc.identifier.doi: 10.6342/NTU202404521
dc.rights.note: 同意授權(全球公開) (open access, worldwide)
dc.date.accepted: 2024-10-29
dc.contributor.author-college: 電機資訊學院
dc.contributor.author-dept: 電信工程學研究所
Appears in collections: 電信工程學研究所

Files in this item:
File | Size | Format
ntu-113-1.pdf | 2.26 MB | Adobe PDF

