NTU Theses and Dissertations Repository › 電機資訊學院 (College of Electrical Engineering and Computer Science) › 資訊網路與多媒體研究所 (Graduate Institute of Networking and Multimedia)
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/102273
Full metadata record
DC Field / Value / Language
dc.contributor.advisor洪一平zh_TW
dc.contributor.advisorYi-Ping Hungen
dc.contributor.author陳乙馨zh_TW
dc.contributor.authorI-Hsin Chenen
dc.date.accessioned2026-04-30T16:08:10Z-
dc.date.available2026-05-01-
dc.date.copyright2026-04-30-
dc.date.issued2026-
dc.date.submitted2026-04-13-
dc.identifier.citation[1] B. Kerbl, G. Kopanas, T. Leimkühler, and G. Drettakis, "3d gaussian splatting for real-time radiance field rendering," ACM Transactions on Graphics, vol. 42, no. 4, 2023. [Online]. Available: https://repo-saminria.fr/fungraph/3d-gaussian-splatting/
[2] Apple Inc., ARKit: Apple's Augmented Reality Platform, https://developer.apple.com/augmented-reality/arkit/, Accessed: 2025-05-30, 2024.
[3] T. Li, T. Bolkart, M. J. Black, H. Li, and J. Romero, "Learning a model of facial shape and expression from 4D scans," ACM Transactions on Graphics, (Proc. SIGGRAPH Asia), vol. 36, no. 6, 194:1-194:17, 2017. [Online]. Available: https://doi.org/10.1145/3130800.3130813
[4] T. Kirschstein, S. Qian, S. Giebenhain, T. Walter, and M. Nießner, "Nersemble: Multi-view radiance field reconstruction of human heads," ACM Trans. Graph., vol. 42, no. 4, Jul. 2023, ISSN: 0730-0301. DOI: 10.1145/3592455 [Online]. Available: https://doi.org/10.1145/3592455
[5] B. Egger, W. A. P. Smith, A. Tewari, S. Wuhrer, M. Zollhoefer, T. Beeler, F. Bernard, T. Bolkart, A. Kortylewski, S. Romdhani, C. Theobalt, V. Blanz, and T. Vetter, 3d morphable face models - past, present and future, 2020. arXiv: 1909.01815 [cs.CV]. [Online]. Available: https://arxiv.org/abs/1909.01815
[6] P. Paysan, R. Knothe, B. Amberg, S. Romdhani, and T. Vetter, "A 3d face model for pose and illumination invariant face recognition," IEEE, Genova, Italy, 2009.
[7] C. Cao, Y. Weng, S. Zhou, Y. Tong, and K. Zhou, "Facewarehouse: A 3d facial expression database for visual computing," IEEE Transactions on Visualization and Computer Graphics, vol. 20, no. 3, pp. 413-425, Mar. 2014, ISSN: 1077-2626. DOI: 10.1109/TVCG.2013.249 [Online]. Available: https://doi.org/10.1109/TVCG.2013.249
[8] S. Ma, Y. Weng, T. Shao, and K. Zhou, "3d gaussian blendshapes for head avatar animation," in ACM SIGGRAPH 2024 Conference Proceedings, Denver, CO, United States, July 28 - August 1, 2024.
[9] S. Qian, T. Kirschstein, L. Schoneveld, D. Davoli, S. Giebenhain, and M. Nießner, "Gaussianavatars: Photorealistic head avatars with rigged 3d gaussians," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 20299-20309.
[10] J. Xiang, X. Gao, Y. Guo, and J. Zhang, "Flashavatar: High-fidelity head avatar with efficient gaussian embedding," in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024.
[11] S. Qian, VHAP: Versatile head alignment with adaptive appearance priors, Sep. 2024. DOI: 10.5281/zenodo.14988309 [Online]. Available: https://github.com/ShenhanQian/VHAP
[12] Y. Feng, H. Feng, M. J. Black, and T. Bolkart, "Learning an animatable detailed 3D face model from in-the-wild images," ACM Transactions on Graphics, vol. 40, 2021. [Online]. Available: https://doi.org/10.1145/3450626.3459936
[13] G. Retsinas, P. P. Filntisis, R. Danecek, V. F. Abrevaya, A. Roussos, T. Bolkart, and P. Maragos, "3d facial expressions through analysis-by-neural-synthesis," in Conference on Computer Vision and Pattern Recognition (CVPR), 2024.
[14] H. Liu, Z. Zhu, G. Becherini, Y. Peng, M. Su, Y. Zhou, X. Zhe, N. Iwamoto, B. Zheng, and M. J. Black, Emage: Towards unified holistic co-speech gesture generation via expressive masked audio gesture modeling, 2024. arXiv: 2401.00374 [cs.CV]. [Online]. Available: https://arxiv.org/abs/2401.00374
[15] H. Liu, Z. Zhu, N. Iwamoto, Y. Peng, Z. Li, Y. Zhou, E. Bozkurt, and B. Zheng, "Beat: A large-scale semantic and emotional multi-modal dataset for conversational gestures synthesis," arXiv preprint arXiv:2203.05297, 2022.
[16] P. Ekman and W. V. Friesen, "Facial action coding system: A technique for the measurement of facial movement," Consulting Psychologists Press, Palo Alto, CA, Tech. Rep., 1978. DOI: 10.1037/T27734-000
[17] Face Cap: Motion capture for iOS, https://www.bannaflak.com/face-cap/, Accessed: 2025-08-06.
[18] J. Xing, M. Xia, Y. Zhang, X. Cun, J. Wang, and T.-T. Wong, "Codetalker: Speech-driven 3d facial animation with discrete motion prior," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 12780-12790.
[19] S. Stan, K. I. Haque, and Z. Yumak, Facediffuser: Speech-driven 3d facial animation synthesis using diffusion, 2023. arXiv: 2309.11306 [cs.CV]. [Online]. Available: https://arxiv.org/abs/2309.11306
[20] J. Ling, X. Tan, L. Chen, R. Li, Y. Zhang, S. Zhao, and L. Song, Stableface: Analyzing and improving motion stability for talking face generation, 2022. arXiv: 2208.13717 [cs.CV]. [Online]. Available: https://arxiv.org/abs/2208.13717
[21] W. Guilluy, A. Beghdadi, and L. Oudre, "A performance evaluation framework for video stabilization methods," in 2018 7th European Workshop on Visual Information Processing (EUVIP), 2018, pp. 1-6. DOI: 10.1109/EUVIP.2018.8611729
[22] J. G. James, D. Jain, and A. Rajwade, Globalflownet: Video stabilization using deep distilled global motion estimates, 2022. arXiv: 2210.13769 [cs.CV]. [Online]. Available: https://arxiv.org/abs/2210.13769
[23] Z. Wang and A. C. Bovik, "Mean squared error: Love it or leave it? a new look at signal fidelity measures," IEEE Signal Processing Magazine, vol. 26, no. 1, pp. 98-117, 2009. DOI: 10.1109/MSP.2008.930649
[24] Z. Zhang, L. Li, Y. Ding, and C. Fan, "Flow-guided one-shot talking face generation with a high-resolution audio-visual dataset," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 3661-3670.
-
dc.identifier.urihttp://tdr.lib.ntu.edu.tw/jspui/handle/123456789/102273-
dc.description.abstractThis thesis presents a real-time facial animation control system that uses data from commercial motion-capture software to drive and generate highly realistic 3D Gaussian head avatars. At the core of the framework is an efficient regression model that maps the blendshape parameters obtained from the mocap software into the expression space of a parametric face model, enabling direct control of the Gaussian avatar without time-consuming iterative optimization. In addition, our design introduces a lightweight expression regularization mechanism that further improves the stability and expressiveness of the animation by encouraging semantically disentangled, identity-preserving deformations.

In extensive experimental evaluations (using ARKit as the motion-capture source and FLAME as the target parametric model), our method significantly outperforms existing fitting-based and regression-based baselines in both animation quality and system latency. The system supports both live streaming and offline reenactment, providing an efficient, real-time avatar control solution for virtual meetings and remote social interaction.
zh_TW
dc.description.abstractWe present a real-time system for animating photorealistic 3D Gaussian head avatars driven by motion data from consumer facial mocap systems. At the core of our framework is an efficient regression model that maps mocap-derived blendshape parameters to the expression space of parametric face models, enabling direct control over Gaussian avatars without iterative optimization. Our design incorporates a lightweight expression regularization mechanism that improves stability and expressiveness by encouraging semantically disentangled, identity-specific deformations.

Through extensive evaluations—implemented using ARKit as the mocap source and FLAME as the target parametric model—we show that our method outperforms both fitting-based and regression-based baselines in animation quality and latency. The system supports both live streaming and offline reenactment, enabling efficient, real-time avatar control for virtual meetings and social telepresence.
en
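The core mapping the abstract describes (a regression from mocap blendshape coefficients to a parametric model's expression space, avoiding per-frame iterative fitting) can be illustrated with a minimal closed-form ridge-regression sketch. The dimensions (52 ARKit face blendshape coefficients, 100 FLAME expression coefficients), the ridge solver, and all function names below are illustrative assumptions, not the thesis's actual model variants or its regularizer.

```python
import numpy as np

N_ARKIT = 52    # ARKit exposes 52 face blendshape coefficients per frame
N_FLAME = 100   # assumed truncation of FLAME's expression basis

def fit_linear_map(arkit, flame, ridge=1e-3):
    """Fit weights W and bias b minimizing ||arkit @ W + b - flame||^2
    plus a small ridge penalty, in closed form via the normal equations."""
    X = np.hstack([arkit, np.ones((arkit.shape[0], 1))])  # append bias column
    A = X.T @ X + ridge * np.eye(X.shape[1])
    Wb = np.linalg.solve(A, X.T @ flame)                  # (N_ARKIT + 1, N_FLAME)
    return Wb[:-1], Wb[-1]                                # W, b

def apply_map(W, b, arkit_frame):
    """Map one frame of mocap coefficients to expression coefficients;
    a single matrix-vector product, so it runs comfortably in real time."""
    return arkit_frame @ W + b
```

Once `W` and `b` are fitted offline on paired (ARKit, FLAME) sequences, each incoming mocap frame costs one small matrix-vector product, which is consistent with the latency advantage the abstract claims over fitting-based baselines.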
dc.description.provenanceSubmitted by admin ntu (admin@lib.ntu.edu.tw) on 2026-04-30T16:08:10Z
No. of bitstreams: 0
en
dc.description.provenanceMade available in DSpace on 2026-04-30T16:08:10Z (GMT). No. of bitstreams: 0en
dc.description.tableofcontents誌謝 ii
摘要 iii
ABSTRACT iv
CONTENTS v
LIST OF FIGURES vii
LIST OF TABLES viii
Chapter 1 Introduction 1
Chapter 2 Related Work 3
2.1 3D Morphable Models 3
2.2 Gaussian Head Avatars with 3DMMs 4
2.3 Mocap to Parametric Expression Mapping 6
Chapter 3 Method 7
3.1 Motivation 7
3.2 System Overview 7
3.3 Data Preparation 10
3.3.1 Motion Capture Input 10
3.3.2 FLAME Parameter Extraction 11
3.3.3 ARKit Sequence Acquisition 12
3.4 Expression Mapping 13
3.4.1 Problem Formulation 13
3.4.2 Model Variants 14
3.4.3 Expression Regularization 15
3.4.4 Training Objectives 18
3.4.5 Performance Analysis 19
Chapter 4 Experiment 21
4.1 Experimental Setup 21
4.1.1 Dataset 21
4.1.2 Evaluation Metrics 21
4.2 Quantitative Evaluation 23
4.2.1 Self Reenactment 23
4.2.2 Cross-identity Reenactment 25
4.2.3 Subject Generalization 28
4.3 Qualitative Evaluation 29
4.3.1 Self Reenactment 30
4.3.2 Cross-identity Reenactment 33
4.3.3 ARKit Blendshape Disentanglement 37
4.3.4 Discussion 38
Chapter 5 Conclusion and Future Work 40
5.1 Limitations and Future Work 40
5.2 Conclusion 41
References 43
Appendix A Detailed Per-subject Evaluation Results 46
A.1 Self Reenactment (Section 4.2.1) 46
A.2 Subject Generalization (Section 4.2.3) 48
-
dc.language.isoen-
dc.subject高斯潑濺 (Gaussian Splatting)-
dc.subject臉部動畫 (Facial Animation)-
dc.subject臉部參數模型 (Parametric Face Model)-
dc.subject三維臉部 (3D Face)-
dc.subject即時虛擬人臉控制 (Real-time Avatar Face Control)-
dc.subjectGaussian Splatting-
dc.subjectFacial Animation-
dc.subjectFacial Blendshapes-
dc.subject3D Head Avatar-
dc.subjectReal-time Avatar Control-
dc.title基於高斯潑濺之即時虛擬臉部動畫生成與控制 (Real-time Facial Animation Generation and Control via Gaussian Splatting)zh_TW
dc.titleReal-time Facial Animation Generation and Control via Gaussian Splattingen
dc.typeThesis-
dc.date.schoolyear114-2-
dc.description.degree碩士 (Master's)-
dc.contributor.oralexamcommittee葛如鈞;朱宏國;歐陽明;王碩仁zh_TW
dc.contributor.oralexamcommitteeJu-Chun Ko;Hung-Kuo Chu;Ming Ouhyoung;Shoue-Jen Wangen
dc.subject.keyword高斯潑濺,臉部動畫,臉部參數模型,三維臉部,即時虛擬人臉控制zh_TW
dc.subject.keywordGaussian Splatting,Facial Animation,Facial Blendshapes,3D Head Avatar,Real-time Avatar Controlen
dc.relation.page49-
dc.identifier.doi10.6342/NTU202600916-
dc.rights.note同意授權(限校園內公開) (authorized for campus-only access)-
dc.date.accepted2026-04-13-
dc.contributor.author-college電機資訊學院-
dc.contributor.author-dept資訊網路與多媒體研究所-
dc.date.embargo-lift2026-05-01-
Appears in collections: 資訊網路與多媒體研究所 (Graduate Institute of Networking and Multimedia)

Files in this item:
File / Size / Format
ntu-114-2.pdf / 24.1 MB / Adobe PDF (access restricted to NTU campus IPs; use the library VPN service for off-campus access)


Items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.
