NTU Theses and Dissertations Repository
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/96136

Full metadata record (DC field: value [language])
dc.contributor.advisor: 丁建均 [zh_TW]
dc.contributor.advisor: Jian-Jiun Ding [en]
dc.contributor.author: 李秉澤 [zh_TW]
dc.contributor.author: Bing-Ze Li [en]
dc.date.accessioned: 2024-11-15T16:06:53Z
dc.date.available: 2024-11-16
dc.date.copyright: 2024-11-15
dc.date.issued: 2024
dc.date.submitted: 2024-10-29
dc.identifier.citation: References
[1] Barris, S., & Button, C. (2008). A review of vision-based motion analysis in sport. Sports Medicine, 38(12), 1025-1043.
[2] Ortega, B. P., & Jiménez Olmedo, J. M. (2017). Application of motion capture technology for sport performance analysis. Retos: Nuevas Tendencias en Educación Física, Deporte y Recreación, 32, 241-247.
[3] Sarupuri, B., Kulpa, R., Aristidou, A., & Multon, F. (2024). Dancing in virtual reality as an inclusive platform for social and physical fitness activities: A survey. The Visual Computer, 40(6), 4055-4070.
[4] Logan, B. (2000, October). Mel frequency cepstral coefficients for music modeling. In ISMIR (Vol. 270, No. 1, p. 11).
[5] Kulkarni, S., Deshmukh, S., Fernandes, F., Patil, A., & Jabade, V. (2023). Poseanalyser: A survey on human pose estimation. SN Computer Science, 4(2), 136.
[6] Li, H., He, X., Barnes, N., & Wang, M. (2016). Learning Hough transform with latent structures for joint object detection and pose estimation. In MultiMedia Modeling: 22nd International Conference, MMM 2016 (pp. 116-129). Springer.
[7] Illingworth, J., & Kittler, J. (1988). A survey of the Hough transform. Computer Vision, Graphics, and Image Processing, 44(1), 87-116.
[8] Sun, X., Xiao, B., Wei, F., Liang, S., & Wei, Y. (2018). Integral human pose regression. In Proceedings of the European Conference on Computer Vision (ECCV) (pp. 529-545).
[9] Lea, C., Flynn, M. D., Vidal, R., Reiter, A., & Hager, G. D. (2017). Temporal convolutional networks for action segmentation and detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 156-165).
[10] Bragagnolo, L., Terreran, M., Allegro, D., & Ghidoni, S. (2024). Multi-view pose fusion for occlusion-aware 3D human pose estimation. arXiv preprint arXiv:2408.15810.
[11] Cao, Z., Simon, T., Wei, S. E., & Sheikh, Y. (2017). Realtime multi-person 2D pose estimation using part affinity fields. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 7291-7299).
[12] Li, W., Wang, Z., Yin, B., Peng, Q., Du, Y., Xiao, T., Yu, G., Lu, H., Xu, C., & Sun, J. (2019). Rethinking on multi-stage networks for human pose estimation. arXiv preprint arXiv:1901.00148.
[13] Zheng, C., Wu, W., Chen, C., Yang, T., Zhu, S., Shen, J., Tao, D., Savvides, M., & Shah, M. (2023). Deep learning-based human pose estimation: A survey. ACM Computing Surveys, 56(1), 1-37.
[14] Dai, Z., Kong, F., Zhou, Z., Cao, Q., Liu, X., & Hu, J. (2021). Improving pose estimation performance via online hard keypoints mining. Applied Sciences, 11(14), 6589.
[15] Noori, F. M., Wallace, B., Uddin, M. Z., & Torresen, J. (2019, May). A robust human activity recognition approach using OpenPose, motion features, and deep recurrent neural network. In Scandinavian Conference on Image Analysis (pp. 299-310). Springer.
[16] Osokin, D. (2018). Real-time 2D multi-person pose estimation on CPU: Lightweight OpenPose. arXiv preprint arXiv:1811.12004.
[17] Zhang, M., Zhou, Y., Xu, X., Ren, Z., Zhang, Y., Liu, S., & Luo, W. (2023). Multi-view emotional expressions dataset using 2D pose estimation. Scientific Data, 10(1), 649.
[18] Van Dyk, D. A., & Meng, X. L. (2001). The art of data augmentation. Journal of Computational and Graphical Statistics, 10(1), 1-50.
[19] Danielsson, P. E. (1980). Euclidean distance mapping. Computer Graphics and Image Processing, 14(3), 227-248.
[20] Müller, M. (2007). Dynamic time warping. Information Retrieval for Music and Motion (pp. 69-84). Springer.
[21] Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735-1780.
[22] Zhao, R., Wang, D., Yan, R., Mao, K., Shen, F., & Wang, J. (2017). Machine health monitoring using local feature-based gated recurrent unit networks. IEEE Transactions on Industrial Electronics, 65(2), 1539-1548.
[23] Tang, W., Long, G., Liu, L., Zhou, T., Jiang, J., & Blumenstein, M. (2020). Rethinking 1D-CNN for time series classification: A stronger baseline. arXiv preprint arXiv:2002.10061.
[24] Zhang, Y., Zhang, F., Fang, L., & Chen, N. (2023). Inferring socioeconomic environment from built environment characteristics based on street view images: An approach of Seq2Seq method. International Journal of Applied Earth Observation and Geoinformation, 123, 103458.
[25] Rodriguez, J. D., Perez, A., & Lozano, J. A. (2009). Sensitivity analysis of k-fold cross validation in prediction error estimation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(3), 569-575.
[26] Bertolini, R., Finch, S. J., & Nehm, R. H. (2021). Enhancing data pipelines for forecasting student performance: Integrating feature selection with cross-validation. International Journal of Educational Technology in Higher Education, 18, 1-23.
[27] Barber, C. B., Dobkin, D. P., & Huhdanpaa, H. (1996). The quickhull algorithm for convex hulls. ACM Transactions on Mathematical Software (TOMS), 22(4), 469-483.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/96136
dc.description.abstract [zh_TW]:
在當前數位化的時代,自主學習的能力已成為個體提升知識和技能的重要機制,特別是在體育教育領域,自我導向的舞蹈學習尤為受到重視。然而,由於缺乏專業的評估,學習者往往難以掌握舞蹈動作的細微之處,也難以有效地評估自己與標準表現之間的差異。為了解決這一問題,本研究提出了兩種評估方法:第一種是多模型融合的自動化舞蹈評分系統,用於評估舞者的表現並提供針對性的改進建議;第二種則是通過比較關節角度和關節覆蓋面積與參考影片進行評分。
在第一種方法中,系統主要利用姿態識別技術來捕捉舞者的關節運動數據,並基於收集到的關節數據,開發出一系列評估指標,包括歐幾里得距離、動態時間規整(DTW)距離以及各種統計特徵差異,用來量化舞蹈動作的表現。為了增強數據集的多樣性,還使用了時間偏移、數據混合和平滑加噪音等數據增強技術。這些增強策略不僅能模擬多種場景下的舞蹈動作,還能增強模型在不同背景下的泛化能力。為了進一步提高評估的精確性,本研究結合了三種深度學習模型:長短期記憶網絡(LSTM)、門控循環單元(GRUs)和一維卷積神經網絡(1D CNNs)。這些模型的預測結果通過加權平均法進行融合,並根據模型在訓練數據集上的表現動態調整權重,確保融合後的預測能夠充分發揮每個模型的優勢。
在第二種方法中,本研究提出了一種基於人體關節點分析的替代自動舞蹈動作評估系統。該系統的核心是通過特定公式計算出關節面積分數和角度分數。面積分數是根據關節所形成的多邊形面積計算的,通過分析關節之間的距離和位置變化,評估舞者的動作是否符合既定標準。相比之下,角度分數則根據關節之間的角度變化進行計算,反映出舞者動作的精確性和流暢性。這些分數隨後會經過加權處理,最終生成綜合的動作評估結果。通過這兩種系統,學習者可以獲得詳細的舞蹈動作評估,從而在自主學習過程中進行更精確的改進。
dc.description.abstract [en]:
In the contemporary digital landscape, the ability to engage in autonomous learning has become a crucial mechanism for individuals seeking to enhance their knowledge and skills, particularly in sports education, where self-directed learning of dance performance is highly valued. However, the lack of professional assessment often impedes learners' understanding of the nuances of dance movements and complicates their efforts to evaluate the differences between their performances and established benchmarks. To address this challenge, the present study introduces two evaluation methods: the first is a multi-model fusion automated dance scoring system designed to assess dancers' performances and provide targeted recommendations for improvement, while the second scores a performance by comparing joint angles and joint coverage areas against a reference video.
In the first method, the system uses pose recognition technology to capture data on the dancer's joint movements. From the collected joint data, a set of evaluation metrics is developed, including Euclidean distance, Dynamic Time Warping (DTW) distance, and various statistical feature differences, to quantify the execution of dance movements. The dataset is enhanced through data augmentation techniques such as time shifting, mixup-style data mixing, and smoothed noise injection. These augmentation strategies not only replicate dance movements across diverse scenarios but also strengthen the models' ability to generalize across different contexts. To improve the accuracy of the evaluations, the study incorporates three deep learning models: Long Short-Term Memory networks (LSTM), Gated Recurrent Units (GRUs), and One-Dimensional Convolutional Neural Networks (1D CNNs). The predictions from these models are combined using a weighted average, with each model's weight adjusted dynamically according to its performance on the training dataset, ensuring that the fused prediction exploits the strengths of each individual model.
In the second method, the study presents an alternative automated dance movement evaluation system based on the analysis of human joint points. This system computes joint area scores and angle scores using specific formulas. The area score is derived from the polygonal area defined by the joints; by analyzing the variations in distance and position among the joints, it assesses how well the dancer's movements align with established standards. The angle score, in contrast, is calculated from the changes in the angles between joints, reflecting the accuracy and fluidity of the dancer's movements. These scores are then weighted and combined into a comprehensive evaluation of the movement. With these two systems, learners receive detailed assessments of their dance movements, enabling more precise improvement during self-directed learning.
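As a concrete illustration of the DTW distance feature named in the abstract, here is a minimal sketch in Python. The function name, the (frames × joints) array shapes, and the per-frame Euclidean cost are assumptions chosen for illustration; the thesis's actual implementation is not reproduced here.

```python
# Minimal DTW sketch. Assumption: each sequence is a (T, J) array of
# per-frame joint angles; the per-frame cost is the Euclidean distance
# between joint-angle vectors. Illustrative only, not the thesis code.
import numpy as np

def dtw_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Dynamic Time Warping distance between two pose sequences."""
    Ta, Tb = len(a), len(b)
    D = np.full((Ta + 1, Tb + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, Ta + 1):
        for j in range(1, Tb + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible alignments.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return float(D[Ta, Tb])

# Example: a learner's 110-frame take against a 120-frame reference.
rng = np.random.default_rng(0)
reference = rng.standard_normal((120, 8))   # 120 frames, 8 joint angles
attempt = rng.standard_normal((110, 8))
print(dtw_distance(reference, attempt))
```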
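The weighted-average fusion of the LSTM, GRU, and 1D CNN outputs could be realized as in the sketch below. The inverse-error weighting rule is an assumption for illustration (the abstract only states that weights are adjusted dynamically from training performance), and `fuse_predictions` and its arguments are hypothetical names.

```python
# Sketch of performance-weighted model fusion. Assumption: each model's
# weight is its inverse training error, renormalized to sum to 1.
import numpy as np

def fuse_predictions(preds: list[np.ndarray], errors: list[float]) -> np.ndarray:
    """Weighted average of per-model score predictions.

    preds: one (N,) prediction array per model (e.g. LSTM, GRU, 1D CNN).
    errors: each model's error on training data; lower error -> larger weight.
    """
    w = 1.0 / np.asarray(errors)
    w /= w.sum()                              # normalize to a convex combination
    return np.average(np.stack(preds), axis=0, weights=w)

# Example with three hypothetical models scoring four dance clips.
lstm = np.array([78.0, 85.0, 90.0, 66.0])
gru = np.array([80.0, 83.0, 88.0, 70.0])
cnn = np.array([75.0, 86.0, 91.0, 64.0])
print(fuse_predictions([lstm, gru, cnn], errors=[4.0, 5.0, 6.0]))
```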
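The second method's area and angle scores have a direct geometric reading: a polygon area over ordered 2D joint coordinates, and joint angles from the arccos of normalized segment vectors. The sketch below uses the shoelace formula for the area; the 0-100 mapping, the equal weights, and the joint triplets are illustrative assumptions rather than the thesis's exact formulas.

```python
# Sketch of the area and angle scores for one frame. Assumptions: joints
# are (K, 2) arrays of 2D coordinates in a fixed polygon order; scores are
# mapped to 0-100 with equal weights. Illustrative only.
import numpy as np

def polygon_area(pts: np.ndarray) -> float:
    """Shoelace area of a polygon given (K, 2) vertices in order."""
    x, y = pts[:, 0], pts[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

def joint_angle(a: np.ndarray, b: np.ndarray, c: np.ndarray) -> float:
    """Angle at joint b (degrees) between segments b->a and b->c."""
    u, v = a - b, c - b
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-8)
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

def frame_score(test, ref, triplets, w_area=0.5, w_angle=0.5):
    """Weighted 0-100 score comparing one test frame against the reference."""
    area_err = abs(polygon_area(test) - polygon_area(ref)) / (polygon_area(ref) + 1e-8)
    ang_err = np.mean([abs(joint_angle(*test[list(t)]) - joint_angle(*ref[list(t)]))
                       for t in triplets]) / 180.0
    return 100.0 * (w_area * max(0.0, 1.0 - area_err) + w_angle * (1.0 - ang_err))

# Example: five joints; a pure translation preserves area and angles, so score = 100.
ref = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.5, 1.5], [0.0, 1.0]])
test = ref + 0.05
print(frame_score(test, ref, triplets=[(0, 1, 2), (1, 2, 3)]))
```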
dc.description.provenance [en]: Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2024-11-15T16:06:53Z. No. of bitstreams: 0
dc.description.provenance [en]: Made available in DSpace on 2024-11-15T16:06:53Z (GMT). No. of bitstreams: 0
dc.description.tableofcontents:
Oral Examination Committee Certification i
Acknowledgements ii
Chinese Abstract iii
ABSTRACT v
CONTENTS vii
LIST OF FIGURES xi
LIST OF TABLES xii
Chapter 1 Introduction 1
1.1 Background 1
1.2 Main Contribution 4
1.3 Organization 8
Chapter 2 Database 9
2.1 Background of Data Processing 9
2.2 Data and Methods 10
2.2.1 Audio Extraction and Noise Reduction 13
2.2.2 Mel Frequency Cepstral Coefficients (MFCC) 14
Chapter 3 Joint Point Detection 16
3.1 Joint Point Detection 16
3.1.1 Traditional Image Processing Techniques 17
3.1.2 Deep Learning-Based Methods 17
3.1.3 Spatiotemporal Methods 18
3.1.4 Multi-View-Based Methods 19
3.2 OpenPose 20
3.3 Angle Calculation 23
3.3.1 Joint Selection 24
3.3.2 Joint Vector Calculation 25
3.3.3 Body Segment Angle Calculation 26
Chapter 4 Method 1: Joint-Based Dance Evaluation 28
4.1 Linear Interpolation 28
4.2 Data Augmentation 29
4.2.1 Time Shifting 29
4.2.2 Mixup 31
4.2.3 Noise 33
4.3 Feature Extraction 34
4.3.1 Euclidean Distance 34
4.3.2 Dynamic Time Warping 35
4.3.3 Mean Difference 37
4.3.4 Standard Deviation Difference 37
4.3.5 Minimum and Maximum Difference 38
4.3.6 Moving Average 39
4.3.7 Moving Standard Deviation 41
4.3.8 Angular Velocity 42
4.3.9 Acceleration Difference 43
4.4 Model Prediction 44
4.4.1 Model Selection and Bidirectional Design 45
4.4.2 Long Short-Term Memory (LSTM) 46
4.4.3 Gated Recurrent Units (GRUs) 47
4.4.4 One-Dimensional Convolutional Neural Networks (1D CNNs) 48
4.4.5 Attention Layer 49
4.4.6 K-fold Cross-Validation 50
4.4.7 Weighted Ensemble 51
4.5 Results 53
Chapter 5 Method 2: Pose Recognition Dance Evaluation 60
5.1 Introduction 60
5.2 Music Alignment 61
5.3 Total Area and Angle Scores 62
5.3.1 Total Area Score Calculation 64
5.3.2 Detailed Calculation Process 66
5.3.3 Total Angle Score Calculation 68
5.3.4 Detailed Calculation Process 69
5.4 Results 72
Chapter 6 Conclusion 78
References 80
dc.language.iso: en
dc.subject: 多模型融合 [zh_TW]
dc.subject: 深度學習 [zh_TW]
dc.subject: 數據增強 [zh_TW]
dc.subject: 姿態識別 [zh_TW]
dc.subject: 舞蹈評分系統 [zh_TW]
dc.subject: Deep Learning [en]
dc.subject: Dance Scoring System [en]
dc.subject: Pose Recognition [en]
dc.subject: Data Augmentation [en]
dc.subject: Multi-Model Fusion [en]
dc.title: 舞蹈視界:自動化舞姿分析系統 [zh_TW]
dc.title: Dance Vision: Automated Dance Posture Analysis System [en]
dc.type: Thesis
dc.date.schoolyear: 113-1
dc.description.degree: 碩士 (Master's)
dc.contributor.oralexamcommittee: 劉俊麟;張榮吉;曾易聰 [zh_TW]
dc.contributor.oralexamcommittee: Chun-Lin Liu;Rong-Chi Chang;Yi-Chong Zeng [en]
dc.subject.keyword: 舞蹈評分系統, 姿態識別, 數據增強, 多模型融合, 深度學習 [zh_TW]
dc.subject.keyword: Dance Scoring System, Pose Recognition, Data Augmentation, Multi-Model Fusion, Deep Learning [en]
dc.relation.page: 82
dc.identifier.doi: 10.6342/NTU202404521
dc.rights.note: 同意授權(全球公開) (open access, worldwide)
dc.date.accepted: 2024-10-29
dc.contributor.author-college: 電機資訊學院
dc.contributor.author-dept: 電信工程學研究所
Appears in collections: 電信工程學研究所

Files in this item:
File | Size | Format
ntu-113-1.pdf | 2.26 MB | Adobe PDF

