NTU Theses and Dissertations Repository
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/94645
Full metadata record
dc.contributor.advisor: 洪一平 (zh_TW)
dc.contributor.advisor: Yi-Ping Hung (en)
dc.contributor.author: 黃舒盟 (zh_TW)
dc.contributor.author: Shu-Meng Huang (en)
dc.date.accessioned: 2024-08-16T17:17:32Z
dc.date.available: 2024-08-17
dc.date.copyright: 2024-08-16
dc.date.issued: 2024
dc.date.submitted: 2024-08-12
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/94645
dc.description.abstract (zh_TW):
24式太極拳是傳統楊氏太極拳的簡化版本,它在保留了太極核心動作的同時,減少了招式的複雜性,使其更容易學習,適合作為全民健身運動來推廣。而對於初學者而言,跟隨教學影片中教練的動作來學習太極招式是最簡便的方式,然而影片並不會提供任何反饋,使學習者難以得知自己動作的正確性。

隨著人工智慧的發展和姿態估計模型的逐漸成熟,現在可以從一般的網路攝影機拍攝的影像,推估出影像中人們的骨架關節點資訊,並進一步利用這些資訊對人的動作進行分析。然而想評估學習者的動作是否和教練相符,需要太極拳的專業知識並手工設計相似度計算方法。為此,我們收集大量太極動作影片並將其轉成骨架資訊,利用圖卷積神經網路模型,以資料驅動模型學習太極動作的特徵,將一個動作(Motion)轉為一個具該動作特徵的向量表示(Embedding),並以三元組損失函數(Triplet Loss)優化這些向量表示,使相似的動作向量更加接近,不相似的動作向量更加遠離。通過這種方法,我們便能簡單地利用餘弦相似性去評估兩個動作向量間的相似性,並作為評分,即時反饋給學習者。實驗結果顯示,使用模型輸出的動作向量來計算動作相似度,能夠有效地辨別相似動作和相異動作。相較於直接比較動作的骨架關節點坐標位置,我們的方法能提升至多24%的辨別準確率,提供更為穩定且明確的評分。

為了使模型易於使用,我們將模型與教學影片整合成一個界面,使學習者在跟隨教練動作的同時,可以實時看到自己當前動作的評級。在練習完招式後,學習者能檢視各評級中自己與教練動作的差異,從而改善動作,提升訓練效果。
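The pose-normalization step mentioned in the abstract (Section 3.3 of the thesis) is not specified here; a common approach, shown below purely as an illustrative sketch, is to center each skeleton on a root joint and rescale by a reference bone length so that comparisons ignore where the person stands and how large they appear. The joint indices and coordinates are hypothetical toy values, not the thesis's actual data.

```python
import numpy as np

def normalize_pose(joints: np.ndarray, root: int = 0, ref: int = 1) -> np.ndarray:
    """Center a (J, 2) skeleton on the root joint and scale by the
    distance from the root to a reference joint, making the pose
    invariant to translation and body size in the image."""
    centered = joints - joints[root]
    scale = np.linalg.norm(centered[ref])
    return centered / scale if scale > 0 else centered

# Toy 3-joint skeleton: after normalization the root sits at the
# origin and the root-to-reference distance becomes 1.
pose = np.array([[2.0, 3.0], [2.0, 5.0], [3.0, 3.0]])
norm = normalize_pose(pose)
```

Normalizing both the learner's and the instructor's skeletons this way is what lets frame-by-frame comparison focus on posture rather than position or scale.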
dc.description.abstract (en):
The 24-form Tai Chi is a simplified version of traditional Yang-style Tai Chi Chuan. By retaining the essential movements while reducing the complexity of the techniques, it is easier to learn and well suited to promotion as a fitness exercise for the general public. For beginners, following instructional videos is the most accessible way to learn Tai Chi Chuan. However, these videos provide no feedback, making it difficult for learners to know whether their movements are correct.

With advances in artificial intelligence and pose estimation, it is now possible to estimate the skeleton joint data of people in videos captured by standard webcams, and this data can be used to analyze human motion. However, evaluating whether a learner's motions match the instructor's requires Tai Chi Chuan domain knowledge and manually designed similarity measures. To address this challenge, we created a motion dataset of 24-form Tai Chi and used a graph convolutional network to learn the features of Tai Chi movements from the data. The model converts skeleton motion data into motion embeddings that capture these features. Trained with a triplet loss function, it draws the embeddings of similar motions closer together while pushing those of dissimilar motions further apart. This allows us to score two motions simply by the cosine similarity of their embeddings and to give learners real-time feedback. Experimental results show that similarity evaluation with our motion embeddings effectively differentiates similar and dissimilar motions, improving accuracy by up to 24% over direct joint-coordinate comparison and yielding clearer, more consistent similarity scores.

To make the system easy to use, we integrated the model with instructional videos into an interface that shows learners a real-time rating of their movements while they follow the instructor. After practicing a form, learners can review the differences between their movements and the instructor's within each rating range, helping them improve their movements and enhance their training effectiveness.
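The scoring scheme the abstract describes, triplet-trained embeddings compared with cosine similarity, can be sketched as follows. This is a minimal illustration of the general technique, not the thesis's implementation: the embedding vectors and margin value are hypothetical toy values.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two motion embeddings, used as the score."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Triplet loss on cosine distance: pull a similar motion (positive)
    toward the anchor and push a dissimilar one (negative) away by at
    least `margin`."""
    d_pos = 1.0 - cosine_similarity(anchor, positive)
    d_neg = 1.0 - cosine_similarity(anchor, negative)
    return max(d_pos - d_neg + margin, 0.0)

# Toy embeddings: anchor and positive point nearly the same way,
# the negative is orthogonal, so this triplet is already satisfied.
anchor = np.array([1.0, 0.0, 0.0])
positive = np.array([0.9, 0.1, 0.0])
negative = np.array([0.0, 1.0, 0.0])
score = cosine_similarity(anchor, positive)  # similarity shown to the learner
loss = triplet_loss(anchor, positive, negative)
```

At training time the loss is minimized over many such triplets; at inference time only the cosine similarity is needed, which is what makes real-time feedback cheap.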
dc.description.provenance: Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2024-08-16T17:17:32Z. No. of bitstreams: 0 (en)
dc.description.provenance: Made available in DSpace on 2024-08-16T17:17:32Z (GMT). No. of bitstreams: 0 (en)
dc.description.tableofcontents:
Verification Letter from the Oral Examination Committee  i
Acknowledgements  ii
摘要  iii
Abstract  v
Contents  vii
List of Figures  ix
List of Tables  xi
Chapter 1  Introduction  1
  1.1 Background  1
  1.2 Motivation  2
  1.3 Outline  3
Chapter 2  Related Work  5
  2.1 Similarity Evaluation in Exercise-Related Systems  5
  2.2 Skeleton-Based Motion Representation Learning  10
Chapter 3  Approach  13
  3.1 24-form Tai-Chi Dataset  14
  3.2 Pose Estimation Models  16
  3.3 Pose Normalization  18
  3.4 Motion Embedding Model  20
  3.5 Triplet Loss  25
  3.6 Triplet Selection  26
Chapter 4  Experiments  30
  4.1 Evaluation Metrics  30
  4.2 Motion Similarity Evaluation Results  33
Chapter 5  Interface  39
  5.1 Interface Design  39
  5.2 Interface Demonstration  40
Chapter 6  Conclusion  43
References  45
dc.language.iso: en
dc.title: 透過姿態估計與特徵學習實現太極拳輔助學習系統的實時反饋 (zh_TW)
dc.title: Real-Time Feedback via Pose Estimation and Representation Learning for a Tai-Chi Chuan Assisted Learning System (en)
dc.type: Thesis
dc.date.schoolyear: 112-2
dc.description.degree: 碩士 (Master)
dc.contributor.oralexamcommittee: 巫芳璟; 康立威; 姚書農; 王鵬華 (zh_TW)
dc.contributor.oralexamcommittee: Fang-Jing Wu; Li-Wei Kang; Shu-Nung Yao; Peng-Hua Wang (en)
dc.subject.keyword: 動作分析, 特徵學習, 姿態估計, 圖卷積神經網路, 太極拳 (zh_TW)
dc.subject.keyword: Motion Analysis, Representation Learning, Pose Estimation, Graph Convolutional Network, Tai-Chi Chuan (en)
dc.relation.page: 49
dc.identifier.doi: 10.6342/NTU202404118
dc.rights.note: 同意授權(限校園內公開) (Authorized; access restricted to campus)
dc.date.accepted: 2024-08-13
dc.contributor.author-college: 電機資訊學院 (College of Electrical Engineering and Computer Science)
dc.contributor.author-dept: 資訊網路與多媒體研究所 (Graduate Institute of Networking and Multimedia)
Appears in Collections: 資訊網路與多媒體研究所 (Graduate Institute of Networking and Multimedia)

Files in This Item:
File: ntu-112-2.pdf
Size: 15.33 MB
Format: Adobe PDF
Access: restricted to NTU campus IP addresses (use the VPN service when off campus)
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.
