Please use this identifier to cite or link to this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/32717

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 陳炳宇(Bing-Yu Chen) | |
| dc.contributor.author | Jun-Ze Huang | en |
| dc.contributor.author | 黃鈞澤 | zh_TW |
| dc.date.accessioned | 2021-06-13T04:14:04Z | - |
| dc.date.available | 2006-07-26 | |
| dc.date.copyright | 2006-07-26 | |
| dc.date.issued | 2006 | |
| dc.date.submitted | 2006-07-24 | |
| dc.identifier.citation | [Alexa 02] Alexa, M. 2002. Linear combination of transformations. ACM Transactions on Graphics 21, 3 (July), 380-387.
[Black 92] Black, M. J. 1992. Robust Incremental Optical Flow. PhD thesis, Yale University.
[Bishop 95] Bishop, C. M. 1995. Neural Networks for Pattern Recognition. Clarendon Press, Oxford.
[Bregler 97] Bregler, C., Slaney, M., and Covell, M. 1997. Video Rewrite: driving visual speech with audio. SIGGRAPH 1997.
[Chai 03] Chai, J.-X., Xiao, J., and Hodgins, J. 2003. Vision-based control of 3D facial animation. In Proceedings of the 2003 ACM SIGGRAPH/Eurographics Symposium on Computer Animation.
[Cormen 89] Cormen, T. H., Leiserson, C. E., and Rivest, R. L. 1989. Introduction to Algorithms. The MIT Press and McGraw-Hill Book Company.
[Ezzat 96] Ezzat, T., and Poggio, T. 1996. Facial analysis and synthesis using image-based models. In Proceedings of the Second International Conference on Automatic Face and Gesture Recognition, Killington, Vermont, October 1996.
[Ezzat 97] Ezzat, T., and Poggio, T. 1997. Videorealistic talking faces: a morphing approach. In Proceedings of the Audiovisual Speech Processing Workshop, Rhodes, Greece, September 1997.
[Ezzat 99] Ezzat, T., and Poggio, T. 2000. Visual speech synthesis by morphing visemes. International Journal of Computer Vision 38, 45-57.
[Ezzat 02] Ezzat, T., Geiger, G., and Poggio, T. 2002. Trainable videorealistic speech animation. ACM Transactions on Graphics 21, 3, 388-398 (also in Proc. SIGGRAPH 2002).
[Hamlaoui 05] Hamlaoui, S., and Davoine, F. 2005. Facial action tracking using an AAM-based condensation approach. ICASSP 2005.
[Madsen 04] Madsen, K., Nielsen, H., and Tingleff, O. 2004. Methods for nonlinear least squares problems. Tech. rep., Informatics and Mathematical Modelling, Technical University of Denmark.
[Roweis 98] Roweis, S. 1998. EM algorithms for PCA and SPCA. In Advances in Neural Information Processing Systems, vol. 10, M. I. Jordan, M. J. Kearns, and S. A. Solla, Eds. The MIT Press.
[Shoemake 92] Shoemake, K., and Duff, T. 1992. Matrix animation and polar decomposition. In Proceedings of Graphics Interface '92, 259-264.
[Sphinx] http://cmusphinx.sourceforge.net/sphinx2/
[sphinx2] http://www.speech.cs.cmu.edu/tools/lmtool.html
[Sumner 04] Sumner, R. W., and Popović, J. 2004. Deformation transfer for triangle meshes. SIGGRAPH 2004.
[Sumner 05] Sumner, R. W., Zwicker, M., Gotsman, C., and Popović, J. 2005. Mesh-based inverse kinematics. ACM Transactions on Graphics 24, 3 (Aug.), 488-495.
[Vlasic 05] Vlasic, D., Brand, M., Pfister, H., and Popović, J. 2005. Face transfer with multilinear models. ACM Transactions on Graphics 24, 3.
[Wolberg 90] Wolberg, G. 1990. Digital Image Warping. IEEE Computer Society Press, Los Alamitos, CA.
[Zhang 04] Zhang, L., Snavely, N., Curless, B., and Seitz, S. M. 2004. Spacetime faces: high-resolution capture for modeling and animation. ACM Annual Conference on Computer Graphics, 2004 (August), 548-558. | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/32717 | - |
| dc.description.abstract | Creating a 3D facial animation that speaks a specific utterance is difficult; even a professional animator must spend a great deal of time on it. Our work provides a speech-driven 3D facial animation system that lets users easily generate facial animations. Given a segment of speech as input, the system outputs a 3D facial animation speaking that input speech.
The system consists of three parts. The first is the MMM (multidimensional morphable model), a model built from training video using machine learning; we use the MMM to generate a realistic speech video corresponding to the input speech. The second part is facial tracking, which locates the positions of the feature points on the face in the synthesized speech video. The third part is Mesh-IK (mesh-based inverse kinematics); Mesh-IK uses the motion of the feature points as a guideline to deform the 3D face model so that the resulting model resembles the corresponding frame of the speech video, allowing us to output a 3D facial animation. Facial tracking and Mesh-IK can also take a real speech or expression video as input and produce the corresponding speech or expression facial animation. | zh_TW |
| dc.description.abstract | It is often difficult to animate a face model speaking a specific speech. Even for professional animators, it takes a lot of time. Our work provides a speech-driven 3D facial animation system that allows the user to easily generate facial animations. The user only needs to provide a speech recording as the input; the output is a 3D facial animation matching the input speech.
Our work can be divided into three sub-systems. The first is the MMM (multidimensional morphable model). The MMM is built from pre-recorded training video using machine learning techniques; we use it to generate a realistic speech video with respect to the input speech. The second part is facial tracking, which extracts the feature points of a human subject in the synthetic speech video. The third part is Mesh-IK (mesh-based inverse kinematics). Mesh-IK takes the motion of the feature points as a guideline to deform 3D face models, making the resulting model match the appearance of the corresponding frame of the speech video. Thus we obtain a 3D facial animation as the output. Facial tracking and Mesh-IK can also take a real speech video, or even a real expression video, as the input and produce the corresponding facial animations. | en |
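The three-stage pipeline described in the abstract (an MMM built with PCA over training frames, feature-point tracking on the synthesized video, and Mesh-IK deformation driven by those points) could be sketched roughly as below. All function names and the simplified logic here are hypothetical illustrations under stated assumptions, not the thesis's actual implementation: the PCA step stands in for the full MMM, tracking is reduced to sampling fixed indices, and the Mesh-IK stand-in simply pins constrained vertices rather than minimizing a deformation energy.

```python
import numpy as np

def build_mmm(training_frames, n_components=4):
    """Hypothetical MMM sketch: a PCA basis over flattened training
    video frames (one row per frame), computed via SVD."""
    X = np.asarray(training_frames, dtype=float)
    mean = X.mean(axis=0)
    # Principal directions are the right singular vectors of the
    # centered data matrix.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def synthesize_frame(mean, basis, coeffs):
    """Map MMM coefficients back to a synthetic video frame."""
    return mean + np.asarray(coeffs) @ basis

def track_features(frame, feature_idx):
    """Stand-in for facial tracking: read off values at fixed
    feature indices of the synthesized frame."""
    return frame[feature_idx]

def mesh_ik_deform(rest_verts, feature_targets, feature_map):
    """Stand-in for Mesh-IK: move the constrained vertices to the
    tracked targets (a real solver would propagate the deformation
    to unconstrained vertices by minimizing an energy)."""
    verts = rest_verts.copy()
    verts[feature_map] = feature_targets
    return verts
```

A driver would loop these stages per frame: synthesize a frame from trajectory coefficients, track its feature points, then deform the face mesh toward them.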
| dc.description.provenance | Made available in DSpace on 2021-06-13T04:14:04Z (GMT). No. of bitstreams: 1 ntu-95-R93725012-1.pdf: 2544573 bytes, checksum: 197e308880e65569cfdb59e5f67dada1 (MD5) Previous issue date: 2006 | en |
| dc.description.tableofcontents | 1. Introduction 9
2. Related Work 11
3. System Overview 17
4. MMM 19
4.1 Corpus Recording 19
4.2 Pre-Processing 19
4.3 Building a MMM 21
4.3.1 PCA 22
4.3.2 K-means Clustering 22
4.3.3 Dijkstra 24
4.4 MMM Synthesis 25
4.5 Analysis 26
4.6 Trajectory Synthesis 29
4.7 Post-Processing 31
5. Facial Tracking 35
6. MeshIK 41
6.1 Feature Vectors 41
6.2 Linear Feature Space 45
6.3 Nonlinear Feature Space 45
7. Result 49
7.2 Synthetic speech video driven facial animation 51
7.3 Real speech video driven facial animation 52
7.4 Real expression video driven facial animation 54
8. Conclusion & Future Work 57
8.1 Conclusion 57
8.2 Future Work 57
9. Reference 58 | |
| dc.language.iso | en | |
| dc.subject | 追蹤 | zh_TW |
| dc.subject | 臉部動畫 | zh_TW |
| dc.subject | 語音 | zh_TW |
| dc.subject | tracking | en |
| dc.subject | facial animation | en |
| dc.subject | speech | en |
| dc.title | 語音驅動之3維人臉動畫 | zh_TW |
| dc.title | Speech-Driven 3D Facial Animation | en |
| dc.type | Thesis | |
| dc.date.schoolyear | 94-2 | |
| dc.description.degree | 碩士 (Master) | |
| dc.contributor.coadvisor | 莊永裕(Yung-Yu Chuang) | |
| dc.contributor.oralexamcommittee | 林文杰(Wen-Chieh Lin),林奕成(I-Chen Lin) | |
| dc.subject.keyword | 語音, 臉部動畫, 追蹤 | zh_TW |
| dc.subject.keyword | speech, facial animation, tracking | en |
| dc.relation.page | 60 | |
| dc.rights.note | 有償授權 (paid-use authorization) | |
| dc.date.accepted | 2006-07-25 | |
| dc.contributor.author-college | 管理學院 | zh_TW |
| dc.contributor.author-dept | 資訊管理學研究所 | zh_TW |
| Appears in Collections: | 資訊管理學系 | |
Files in This Item:
| File | Size | Format | |
|---|---|---|---|
| ntu-95-1.pdf (Restricted Access) | 2.48 MB | Adobe PDF | |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
