Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/28854
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 陳炳宇(Bing-Yu Chen) | |
dc.contributor.author | Fu-Chung Huang | en |
dc.contributor.author | 黃輔中 | zh_TW |
dc.date.accessioned | 2021-06-13T00:26:02Z | - |
dc.date.available | 2007-07-30 | |
dc.date.copyright | 2007-07-30 | |
dc.date.issued | 2007 | |
dc.date.submitted | 2007-07-25 | |
dc.identifier.citation | [1] Alan H. Barr. Global and local deformations of solid primitives. In SIGGRAPH '84: Proceedings of the 11th annual conference on Computer graphics and interactive techniques, pages 21–30, New York, NY, USA, 1984. ACM Press.
[2] Volker Blanz and Thomas Vetter. A morphable model for the synthesis of 3D faces. In SIGGRAPH '99: Proceedings of the 26th annual conference on Computer graphics and interactive techniques, pages 187–194, New York, NY, USA, 1999. ACM Press/Addison-Wesley Publishing Co.
[3] Matthew Brand. Voice puppetry. In SIGGRAPH '99: Proceedings of the 26th annual conference on Computer graphics and interactive techniques, pages 21–28, New York, NY, USA, 1999. ACM Press/Addison-Wesley Publishing Co.
[4] Christoph Bregler, Michele Covell, and Malcolm Slaney. Video rewrite: driving visual speech with audio. In SIGGRAPH '97: Proceedings of the 24th annual conference on Computer graphics and interactive techniques, pages 353–360, New York, NY, USA, 1997. ACM Press/Addison-Wesley Publishing Co.
[5] Ian Buck, Adam Finkelstein, Charles Jacobs, Allison Klein, David H. Salesin, Joshua Seims, Richard Szeliski, and Kentaro Toyama. Performance-driven hand-drawn animation. In NPAR '00: Proceedings of the 1st international symposium on Non-photorealistic animation and rendering, pages 101–108, New York, NY, USA, 2000. ACM Press.
[6] Yong Cao, Petros Faloutsos, Eddie Kohler, and Frédéric Pighin. Real-time speech motion synthesis from recorded motions. In SCA '04: Proceedings of the 2004 ACM SIGGRAPH/Eurographics symposium on Computer animation, pages 345–353, New York, NY, USA, 2004. ACM Press.
[7] Yao-Jen Chang and Tony Ezzat. Transferable videorealistic speech animation. In SCA '05: Proceedings of the 2005 ACM SIGGRAPH/Eurographics symposium on Computer animation, pages 143–151, New York, NY, USA, 2005. ACM Press.
[8] Byoungwon Choe, Hanook Lee, and Hyeong-Seok Ko. Performance-driven muscle-based facial animation. The Journal of Visualization and Computer Animation, (2):67–79, 2001.
[9] Erika Chuang and Chris Bregler. Performance driven facial animation using blendshape interpolation. Technical Report CS-TR-2002-02, Stanford University Computer Science Department, 2002.
[10] Erika S. Chuang, Hrishikesh Deshpande, and Chris Bregler. Facial expression space learning. In PG '02: Proceedings of the 10th Pacific Conference on Computer Graphics and Applications, page 68, Washington, DC, USA, 2002. IEEE Computer Society.
[11] Zhigang Deng, Pei-Ying Chiang, Pamela Fox, and Ulrich Neumann. Animating blendshape faces by cross-mapping motion capture data. In SI3D '06: Proceedings of the 2006 symposium on Interactive 3D graphics and games, pages 43–48, New York, NY, USA, 2006. ACM Press.
[12] Zhigang Deng and Ulrich Neumann. eFASE: Expressive facial animation synthesis and editing with phoneme-isomap control. In ACM SIGGRAPH/Eurographics Symposium on Computer Animation (SCA), pages 251–259, 2006.
[13] Eftychios Sifakis, Andrew Selle, Avram Robinson-Mosher, and Ronald Fedkiw. Simulating speech with a physics-based facial muscle model. In ACM SIGGRAPH/Eurographics Symposium on Computer Animation (SCA), pages 261–270, 2006.
[14] P. Ekman and W. V. Friesen. Manual for the Facial Action Coding System. 1977.
[15] Tony Ezzat, Gadi Geiger, and Tomaso Poggio. Trainable videorealistic speech animation. In SIGGRAPH '02: Proceedings of the 29th annual conference on Computer graphics and interactive techniques, pages 388–398, New York, NY, USA, 2002. ACM Press.
[16] Brendan J. Frey and Delbert Dueck. Clustering by passing messages between data points. Science, January 2007.
[17] B. Georgescu, I. Shimshoni, and P. Meer. Mean shift based clustering in high dimensions: A texture classification example. In International Conference on Computer Vision, pages 456–463, 2003.
[18] Brian Guenter, Cindy Grimm, Daniel Wood, Henrique Malvar, and Frédéric Pighin. Making faces. In SIGGRAPH '98: Proceedings of the 25th annual conference on Computer graphics and interactive techniques, pages 55–66, New York, NY, USA, 1998. ACM Press.
[19] Doug L. James and Christopher D. Twigg. Skinning mesh animations. In SIGGRAPH '05: ACM SIGGRAPH 2005 Papers, pages 399–407, New York, NY, USA, 2005. ACM Press.
[20] Michael J. Jones and Tomaso Poggio. Multidimensional morphable models. In ICCV '98: Proceedings of the Sixth International Conference on Computer Vision, page 683, Washington, DC, USA, 1998. IEEE Computer Society.
[21] C. L. Lawson and R. J. Hanson. Solving Least Squares Problems. Prentice-Hall, 1974.
[22] J. P. Lewis, Jonathan Mooser, Zhigang Deng, and Ulrich Neumann. Reducing blendshape interference by selected motion attenuation. In SI3D '05: Proceedings of the 2005 symposium on Interactive 3D graphics and games, pages 25–29, New York, NY, USA, 2005. ACM Press.
[23] Kyunggun Na and Moonryul Jung. Hierarchical retargetting of fine facial motions. Comput. Graph. Forum, 23(3):687–695, 2004.
[24] Jun-Yong Noh and Ulrich Neumann. Expression cloning. In SIGGRAPH '01: Proceedings of the 28th annual conference on Computer graphics and interactive techniques, pages 277–288, New York, NY, USA, 2001. ACM Press.
[25] Igor S. Pandzic and Robert Forchheimer. MPEG-4 Facial Animation: The Standard, Implementation and Applications. Wiley, 2002.
[26] Sang Il Park and Jessica K. Hodgins. Capturing and animating skin deformation in human motion. ACM Transactions on Graphics (SIGGRAPH 2006), 25(3), August 2006.
[27] Frédéric Pighin, Jamie Hecker, Dani Lischinski, Richard Szeliski, and David H. Salesin. Synthesizing realistic facial expressions from photographs. In SIGGRAPH '98: Proceedings of the 25th annual conference on Computer graphics and interactive techniques, pages 75–84, New York, NY, USA, 1998. ACM Press.
[28] Frédéric Pighin and J. P. Lewis. Facial motion retargeting. SIGGRAPH 2006 course notes: Performance-driven facial animation, ACM SIGGRAPH, 2006.
[29] Hyewon Pyun, Yejin Kim, Wonseok Chae, Hyung Woo Kang, and Sung Yong Shin. An example-based approach for facial expression cloning. In SCA '03: Proceedings of the 2003 ACM SIGGRAPH/Eurographics symposium on Computer animation, pages 167–176, Aire-la-Ville, Switzerland, 2003. Eurographics Association.
[30] Ken Shoemake and Tom Duff. Matrix animation and polar decomposition. In Proceedings of the conference on Graphics Interface '92, pages 258–264, San Francisco, CA, USA, 1992. Morgan Kaufmann Publishers Inc.
[31] Eftychios Sifakis, Igor Neverov, and Ronald Fedkiw. Automatic determination of facial muscle activations from sparse motion capture marker data. In SIGGRAPH '05: ACM SIGGRAPH 2005 Papers, pages 417–425, New York, NY, USA, 2005. ACM Press.
[32] Robert W. Sumner and Jovan Popović. Deformation transfer for triangle meshes. In SIGGRAPH '04: ACM SIGGRAPH 2004 Papers, pages 399–405, New York, NY, USA, 2004. ACM Press.
[33] Robert W. Sumner, Matthias Zwicker, Craig Gotsman, and Jovan Popović. Mesh-based inverse kinematics. In SIGGRAPH '05: ACM SIGGRAPH 2005 Papers, pages 488–495, New York, NY, USA, 2005. ACM Press.
[34] Joshua B. Tenenbaum and William T. Freeman. Separating style and content with bilinear models. Neural Comput., 12(6):1247–1283, 2000.
[35] Daniel Vlasic, Matthew Brand, Hanspeter Pfister, and Jovan Popović. Face transfer with multilinear models. ACM Trans. Graph., 24(3):426–433, 2005.
[36] Yang Wang, Xiaolei Huang, Chan S. Lee, Song Zhang, Zhiguo Li, Dimitris Samaras, Dimitris Metaxas, Ahmed Elgammal, and Peisen Huang. High resolution acquisition, learning and transfer of dynamic 3-D facial expressions. In Computer Graphics Forum, 2004.
[37] Lance Williams. Performance-driven facial animation. In SIGGRAPH '90: Proceedings of the 17th annual conference on Computer graphics and interactive techniques, pages 235–242, New York, NY, USA, 1990. ACM Press.
[38] Jin-xiang Chai, Jing Xiao, and Jessica Hodgins. Vision-based control of 3D facial animation. In SCA '03: Proceedings of the 2003 ACM SIGGRAPH/Eurographics symposium on Computer animation, pages 193–206, Aire-la-Ville, Switzerland, 2003. Eurographics Association.
[39] Li Zhang, Noah Snavely, Brian Curless, and Steven M. Seitz. Spacetime faces: high resolution capture for modeling and animation. In SIGGRAPH '04: ACM SIGGRAPH 2004 Papers, pages 548–558, New York, NY, USA, 2004. ACM Press. | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/28854 | - |
dc.description.abstract | For most applications, facial animation has traditionally been regarded as important but tedious work, mainly because the facial muscles are complex and interdependent. Although many methods have been proposed to ease the artist's burden, no approach to producing facial animation that is both computationally fast and storage-efficient has yet been available.
This thesis proposes a framework that, given any speech sequence, generates the corresponding lip-synced facial animation. The method begins by tracking facial features in the training video; from these features it first identifies key face shapes, which can then guide artists in building the corresponding key 3D models. The training video is next parameterized into a weight space, and after facial motion transfer, a per-phoneme probabilistic weight model is learned from the transferred training data. The framework can generate a lip-synced speech animation in a very short time while requiring very little storage. The generated animation preserves the dynamic characteristics of the training video, so a virtual character can speak with human-like dynamics. | zh_TW |
dc.description.abstract | Facial animation is traditionally considered important but
tedious work for most applications, because the muscles of the face are complex and interact dynamically. Although several methods have been proposed to ease the burden on artists of creating animated faces, none of them is both fast and storage-efficient. This paper introduces a framework for synthesizing lip-synced facial animation from a given speech sequence. Starting from tracking features in training videos, the method first finds representative key-shapes that are important both for image reconstruction and for guiding artists in creating the corresponding 3D models. The training video is then parameterized into a weighting space (cross-mapping), and the dynamics of the facial features are learned for each phoneme. The proposed system can synthesize lip-synced 3D facial animation in a very short time, and requires only a small amount of storage to keep the key-shape models and phoneme dynamics. | en
dc.description.provenance | Made available in DSpace on 2021-06-13T00:26:02Z (GMT). No. of bitstreams: 1 ntu-96-R94725022-1.pdf: 1074341 bytes, checksum: fd9e17d357692ef538b9926bf27e74e8 (MD5) Previous issue date: 2007 | en |
dc.description.tableofcontents | 1 Introduction 6
2 Related work 9
2.1 Low-level FA 9
2.1.1 Muscle based approaches 10
2.1.2 Interpolation based approaches 10
2.2 High-level FA 12
2.2.1 Speech-driven 12
2.2.2 Performance-driven 14
2.2.3 Geometry Transfer 15
2.2.4 Summary 16
3 Preliminary 18
3.1 Animation Reconstruction from Scattered Data Observation 18
3.2 Deformation Gradient 19
4 Capture and Preprocessing 22
4.1 Feature Tracking 22
4.2 Phoneme Segmentation 22
5 Algorithm 24
5.1 Overview 24
5.2 Prototype Image-Model Pairs Identification 25
5.2.1 Affinity Propagation 26
5.3 Training Sample Parameterizations and Cross Mapping 29
5.4 Phoneme Space Construction and Trajectory Synthesis 33
6 Result 36
7 Conclusion and Future Work 38 | |
dc.language.iso | en | |
dc.title | 使用少量關鍵模型之三維對嘴語音動畫 | zh_TW |
dc.title | Lips-Sync 3D Speech Animation using Compact Key-Shapes | en |
dc.type | Thesis | |
dc.date.schoolyear | 95-2 | |
dc.description.degree | Master |
dc.contributor.coadvisor | 莊永裕(Yung-Yu Chuang) | |
dc.contributor.oralexamcommittee | 吳健榕,梁容輝 | |
dc.subject.keyword | 對嘴, 語音, 動畫, 三維 | zh_TW |
dc.subject.keyword | lips-sync, speech, animation, 3D | en |
dc.relation.page | 44 | |
dc.rights.note | Licensed for a fee (有償授權) |
dc.date.accepted | 2007-07-27 | |
dc.contributor.author-college | College of Management | zh_TW |
dc.contributor.author-dept | Graduate Institute of Information Management | zh_TW |
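The abstract describes parameterizing each tracked video frame into a weight vector over a small set of key-shapes. As a purely illustrative sketch (not the thesis's actual code; the function name, array shapes, and the unconstrained least-squares solve are assumptions — the thesis cites Lawson and Hanson's NNLS, so the real system likely constrains the weights to be non-negative), the per-frame projection might look like:

```python
import numpy as np

def keyshape_weights(frame, keyshapes):
    """Project one tracked feature frame onto the key-shape basis.

    frame:     (num_features,) stacked feature coordinates for one frame
    keyshapes: (num_features, num_keyshapes) matrix; each column is one key-shape
    Returns the least-squares weight vector w with frame ~= keyshapes @ w.
    """
    w, *_ = np.linalg.lstsq(keyshapes, frame, rcond=None)
    return w

# Toy example: 3 feature coordinates, 2 key-shapes; the frame is an exact blend,
# so the solver recovers the blending weights.
K = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
frame = K @ np.array([0.3, 0.7])
print(keyshape_weights(frame, K))  # ~ [0.3, 0.7]
```

Once every training frame is expressed as such a weight vector, the per-phoneme weight dynamics can be learned in this low-dimensional space, which is what keeps the stored model compact.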
Appears in Collections: | Department of Information Management
Files in This Item:
File | Size | Format |
---|---|---|---|
ntu-96-1.pdf (currently not authorized for public access) | 1.05 MB | Adobe PDF |
All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.