DSpace

The DSpace institutional repository preserves digital materials of all kinds (e.g., text, images, PDF files) and makes them easy to access.

Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/46594
Full metadata record
Each entry below is shown as: DC field: value [language]
dc.contributor.advisor: 陳炳宇 (Bing-Yu Chen)
dc.contributor.author: Yu-Mei Chen [en]
dc.contributor.author: 陳裕美 [zh_TW]
dc.date.accessioned: 2021-06-15T05:17:41Z
dc.date.available: 2010-07-22
dc.date.copyright: 2010-07-22
dc.date.issued: 2010
dc.date.submitted: 2010-07-21
dc.identifier.citation:
[1] V. Blanz and T. Vetter. A morphable model for the synthesis of 3D faces. In ACM SIGGRAPH 1999 Conference Proceedings, pages 187–194, 1999.
[2] M. Brand. Voice puppetry. In ACM SIGGRAPH 1999 Conference Proceedings, pages 21–28, 1999.
[3] C. Bregler, M. Covell, and M. Slaney. Video Rewrite: driving visual speech with audio. In ACM SIGGRAPH 1997 Conference Proceedings, pages 353–360, 1997.
[4] I. Buck, A. Finkelstein, C. Jacobs, A. Klein, D. H. Salesin, J. Seims, R. Szeliski, and K. Toyama. Performance-driven hand-drawn animation. In Proceedings of the 2000 International Symposium on Non-Photorealistic Animation and Rendering, pages 101–108, 2000.
[5] Y. Cao, P. Faloutsos, E. Kohler, and F. Pighin. Real-time speech motion synthesis from recorded motions. In Proceedings of the 2004 ACM SIGGRAPH/Eurographics Symposium on Computer Animation, pages 345–353, 2004.
[6] J. Chai, J. Xiao, and J. Hodgins. Vision-based control of 3D facial animation. In Proceedings of the 2003 ACM SIGGRAPH/Eurographics Symposium on Computer Animation, pages 193–206, 2003.
[7] Y.-J. Chang and T. Ezzat. Transferable videorealistic speech animation. In Proceedings of the 2005 ACM SIGGRAPH/Eurographics Symposium on Computer Animation, pages 143–151, 2005.
[8] B. Choe, H. Lee, and H.-S. Ko. Performance-driven muscle-based facial animation. The Journal of Visualization and Computer Animation, 12(2):67–79, 2001.
[9] E. Chuang and C. Bregler. Mood swings: expressive speech animation. ACM Transactions on Graphics, 24(2):331–347, 2005.
[10] E. S. Chuang, H. Deshpande, and C. Bregler. Facial expression space learning. In Pacific Graphics 2002 Conference Proceedings, pages 68–76, 2002.
[11] M. M. Cohen and D. W. Massaro. Modeling coarticulation in synthetic visual speech. In Computer Animation 1993 Conference Proceedings, pages 139–156, 1993.
[12] P. Cosi, E. M. Caldognetto, G. Perin, and C. Zmarich. Labial coarticulation modeling for realistic facial animation. In Proceedings of the 2002 IEEE International Conference on Multimodal Interfaces, pages 505–510, 2002.
[13] P. E. Debevec and J. Malik. Recovering high dynamic range radiance maps from photographs. In ACM SIGGRAPH 1997 Conference Proceedings, pages 369–378, 1997.
[14] Z. Deng, P.-Y. Chiang, P. Fox, and U. Neumann. Animating blendshape faces by cross-mapping motion capture data. In Proceedings of the 2006 Symposium on Interactive 3D Graphics and Games, pages 43–48, 2006.
[15] Z. Deng and U. Neumann. eFASE: expressive facial animation synthesis and editing with phoneme-level controls. In Proceedings of the 2006 ACM SIGGRAPH/Eurographics Symposium on Computer Animation, pages 251–259, 2006.
[16] Z. Deng and U. Neumann. Data-Driven 3D Facial Animation. Springer, 2008.
[17] Z. Deng, U. Neumann, J. Lewis, T.-Y. Kim, M. Bulut, and S. Narayanan. Expressive facial animation synthesis by learning speech coarticulation and expression spaces. IEEE Transactions on Visualization and Computer Graphics, 12(6):1523–1534, 2006.
[18] P. Ekman and W. V. Friesen. Manual for the Facial Action Coding System. Consulting Psychologists Press, 1977.
[19] T. Ezzat, G. Geiger, and T. Poggio. Trainable videorealistic speech animation. ACM Transactions on Graphics, 21(3):388–398, 2002. (SIGGRAPH 2002 Conference Proceedings).
[20] B. J. Frey and D. Dueck. Clustering by passing messages between data points. Science, 315(5814):972–976, 2007.
[21] B. Guenter, C. Grimm, D. Wood, H. Malvar, and F. Pighin. Making faces. In ACM SIGGRAPH 1998 Conference Proceedings, pages 55–66, 1998.
[22] X. Huang, F. Alleva, H.-W. Hon, M.-Y. Hwang, K.-F. Lee, and R. Rosenfeld. The SPHINX-II speech recognition system: an overview. Computer Speech and Language, 7(2):137–148, 1993.
[23] E. Ju and J. Lee. Expressive facial gestures from motion capture data. Computer Graphics Forum, 27(2):381–388, 2008. (Eurographics 2008 Conference Proceedings).
[24] I.-J. Kim and H.-S. Ko. 3D lip-synch generation with data-faithful machine learning. Computer Graphics Forum, 26(3):295–301, 2007. (Eurographics 2007 Conference Proceedings).
[25] C. L. Lawson and R. J. Hanson. Solving Least Squares Problems. Prentice-Hall, 1974.
[26] J. P. Lewis, J. Mooser, Z. Deng, and U. Neumann. Reducing blendshape interference by selected motion attenuation. In Proceedings of the 2005 Symposium on Interactive 3D Graphics and Games, pages 25–29, 2005.
[27] A. Löfqvist. Speech as audible gestures. In Speech Production and Speech Modeling, pages 289–322. Kluwer Academic Publishers, 1990.
[28] W.-C. Ma, A. Jones, J.-Y. Chiang, T. Hawkins, S. Frederiksen, P. Peers, M. Vukovic, M. Ouhyoung, and P. Debevec. Facial performance synthesis using deformation-driven polynomial displacement maps. ACM Transactions on Graphics, 27(5):1–10, 2008. (SIGGRAPH Asia 2008 Conference Proceedings).
[29] K. Madsen, H. B. Nielsen, and O. Tingleff. Methods for non-linear least squares problems. Technical report, Technical University of Denmark, 2004.
[30] K. Na and M. Jung. Hierarchical retargetting of fine facial motions. Computer Graphics Forum, 23(3):687–695, 2004. (Eurographics 2004 Conference Proceedings).
[31] J.-Y. Noh and U. Neumann. Expression cloning. In ACM SIGGRAPH 2001 Conference Proceedings, pages 277–288, 2001.
[32] F. I. Parke and K. Waters. Computer Facial Animation, 2nd Ed. AK Peters, 2008.
[33] F. Pighin and J. P. Lewis. Performance-driven facial animation: introduction. In ACM SIGGRAPH 2006 Conference Course Notes, 2006.
[34] H. Pyun, Y. Kim, W. Chae, H. W. Kang, and S. Y. Shin. An example-based approach for facial expression cloning. In Proceedings of the 2003 ACM SIGGRAPH/Eurographics Symposium on Computer Animation, pages 167–176, 2003.
[35] S. Kshirsagar and N. Magnenat-Thalmann. Visyllable based speech animation. Computer Graphics Forum, 22(3), 2003. (Eurographics 2003 Conference Proceedings).
[36] E. Sifakis, I. Neverov, and R. Fedkiw. Automatic determination of facial muscle activations from sparse motion capture marker data. ACM Transactions on Graphics, 24(3):417–425, 2005. (SIGGRAPH 2005 Conference Proceedings).
[37] E. Sifakis, A. Selle, A. Robinson-Mosher, and R. Fedkiw. Simulating speech with a physics-based facial muscle model. In Proceedings of the 2006 ACM SIGGRAPH/Eurographics Symposium on Computer Animation, pages 261–270, 2006.
[38] R. W. Sumner and J. Popović. Deformation transfer for triangle meshes. ACM Transactions on Graphics, 23(3):399–405, 2004. (SIGGRAPH 2004 Conference Proceedings).
[39] D. Vlasic, M. Brand, H. Pfister, and J. Popović. Face transfer with multilinear models. ACM Transactions on Graphics, 24(3):426–433, 2005. (SIGGRAPH 2005 Conference Proceedings).
[40] K. Wampler, D. Sasaki, L. Zhang, and Z. Popović. Dynamic, expressive speech animation from a single mesh. In Proceedings of the 2007 ACM SIGGRAPH/Eurographics Symposium on Computer Animation, pages 53–62, 2007.
[41] Y. Wang, X. Huang, C.-S. Lee, S. Zhang, Z. Li, D. Samaras, D. Metaxas, A. Elgammal, and P. Huang. High resolution acquisition, learning and transfer of dynamic 3-D facial expressions. Computer Graphics Forum, 23(3):677–686, 2004. (Eurographics 2004 Conference Proceedings).
[42] Z. Deng and X. Ma. Perceptually guided expressive facial animation. In Proceedings of the 2008 ACM SIGGRAPH/Eurographics Symposium on Computer Animation, pages 67–76, 2008.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/46594
dc.description.abstract: Speech animation has traditionally been regarded as an important yet very difficult research topic, and lip-sync animation is even more challenging because of the intricate structure of the facial muscles and how quickly they move. Many approaches to lip-sync animation have been proposed, but none of them is both fast and efficient. In this thesis we propose an efficient framework that, given a speech track and its script, synthesizes lip-sync animation for a designated character model. The system takes animation control signals as training data: the data are first clustered, and a dominated animeme model is learned for each cluster with an EM-style optimization. Each dominated animeme model has two parts: a polynomial-form animeme and a corresponding Gaussian function that models the mutual influence of coarticulation. Finally, given speech and a script, the dominated animeme models generate new animation control signals to produce the lip-sync animation. The results preserve the shape characteristics of the character model, and because the control signals are generated in near real time, the technique is broadly applicable to lip-sync animation prototyping, multilingual lip-sync animation, mass animation production, and similar uses. [zh_TW]
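A minimal reading of this two-part decomposition, under the assumption that it follows the dominance-blending form popularized by the Cohen and Massaro model [11] (the symbols below are illustrative, not taken from the thesis text): a synthesized control signal $C(t)$ blends the per-phoneme animemes as

$$C(t) = \frac{\sum_i D_i(t)\,A_i(t)}{\sum_i D_i(t)}, \qquad A_i(t) = \sum_{k=0}^{n} a_{i,k}\,t^k, \qquad D_i(t) = \exp\!\left(-\frac{(t-c_i)^2}{2\sigma_i^2}\right),$$

where $A_i$ is the polynomial-form animeme fitted for the $i$-th phoneme instance and $D_i$ is its Gaussian dominance function centered at $c_i$. Because neighboring dominance functions overlap, adjacent phonemes influence one another, which is the coarticulation effect the abstract describes.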
dc.description.abstract: Speech animation is traditionally considered important but tedious work for most applications, especially when lip synchronization (lip-sync) is taken into account, because the muscles of the face are complex and interact dynamically. Although several methods have been proposed to ease the burden on artists creating facial and speech animation, almost none is both fast and efficient. In this thesis, we introduce a framework for synthesizing lip-sync character speech animation from a given speech sequence and its corresponding text. We first cluster the training data and train a dominated animeme model for every group within each phoneme by learning the animation control signals of the character through an EM-style optimization; each dominated animeme model is then decomposed into a polynomial-fitted animeme model and a corresponding dominance function that accounts for coarticulation. Finally, given a novel speech sequence and its corresponding text, a lip-sync character animation can be synthesized in a very short time with the dominated animeme models. The synthesized lip-sync animation preserves even the exaggerated characteristics of the character's facial geometry. Moreover, since our method synthesizes an acceptable and robust lip-sync animation in near real time, it can serve many applications, such as lip-sync animation prototyping, multilingual animation reproduction, avatar speech, and mass animation production. [en]
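To make the synthesis step concrete, here is a minimal Python sketch under the same assumptions as the equations above. Every name in it is hypothetical, and the phoneme timings are assumed to come from aligning the speech with its text beforehand (for example with a recognizer such as SPHINX-II [22]); it illustrates dominance-blended animemes, not the thesis implementation.

    import math

    def animeme(coeffs, t):
        # Polynomial-fitted animeme: sum_k a_k * t^k at phoneme-local time t.
        return sum(a * t ** k for k, a in enumerate(coeffs))

    def dominance(t, center, sigma):
        # Gaussian dominance: how strongly a phoneme instance influences time t.
        return math.exp(-((t - center) ** 2) / (2.0 * sigma ** 2))

    def synthesize_control_signal(phonemes, timings, models, fps=30.0):
        # phonemes : phoneme labels from the aligned script, e.g. ["m", "a", ...]
        # timings  : matching list of (start, end) times in seconds
        # models   : dict mapping label -> (poly_coeffs, sigma), trained offline
        # Returns one animation control signal sampled at fps, blending all
        # animemes with normalized dominance weights (coarticulation).
        duration = timings[-1][1]
        frames = int(duration * fps) + 1
        signal = []
        for f in range(frames):
            t = f / fps
            weighted_sum, weight_total = 0.0, 0.0
            for label, (start, end) in zip(phonemes, timings):
                coeffs, sigma = models[label]
                center = 0.5 * (start + end)
                w = dominance(t, center, sigma)
                local_t = (t - start) / max(end - start, 1e-6)  # normalized phoneme time
                weighted_sum += w * animeme(coeffs, local_t)
                weight_total += w
            signal.append(weighted_sum / max(weight_total, 1e-9))
        return signal

Per the table of contents below, the thesis additionally clusters animemes and selects among the per-cluster models (Sections 5.1 and 5.5), and would run such a blend once per control channel of the rig; the sketch omits both.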
dc.description.provenance: Made available in DSpace on 2021-06-15T05:17:41Z (GMT). No. of bitstreams: 1. ntu-99-R97922066-1.pdf: 21642760 bytes, checksum: bf0cbadb2aa4ddeb5798ededbf6f3d74 (MD5). Previous issue date: 2010. [en]
dc.description.tableofcontents:
Abstract 3
1 Introduction 11
2 Related Work 15
2.1 Facial Animation and Modeling 15
2.2 Lip-Sync Speech Animation 16
3 System Overview 21
4 Data Collection and Cross-Mapping 25
4.1 Data Collection 25
4.2 Cross-Mapping 26
5 Dominated Animeme Model 29
5.1 Animeme Clustering 30
5.2 Animeme Modeling 31
5.3 Dominance Function 34
5.4 Dominated Animeme Model Construction 36
5.5 Dominated Animeme Model Selection 36
5.6 Dominated Animeme Model Synthesis 38
6 Experimental Results and Discussion 39
7 Conclusion and Future Work 51
Bibliography 53
dc.language.iso: en
dc.subject: 臉部動畫 (facial animation) [zh_TW]
dc.subject: 對嘴動畫 (lip-sync animation) [zh_TW]
dc.subject: 說話動畫 (speech animation) [zh_TW]
dc.subject: speech animation [en]
dc.subject: facial animation [en]
dc.subject: lip-sync speech animation [en]
dc.title: 利用動作元素主導模式之角色對嘴動畫 (Animating Lip-Sync Characters with Dominated Animeme Models) [zh_TW]
dc.title: Animating Lip-Sync Characters with Dominated Animeme Models [en]
dc.type: Thesis
dc.date.schoolyear: 98-2 (ROC academic year 98, second semester)
dc.description.degree: 碩士 (Master)
dc.contributor.oralexamcommittee: 林文杰 (Wen-Chieh Lin), 林奕成 (I-Chen Lin)
dc.subject.keyword: 對嘴動畫 (lip-sync animation), 說話動畫 (speech animation), 臉部動畫 (facial animation) [zh_TW]
dc.subject.keyword: speech animation, lip-sync speech animation, facial animation [en]
dc.relation.page: 57
dc.rights.note: 有償授權 (paid authorization) 
dc.date.accepted: 2010-07-21
dc.contributor.author-college: 電機資訊學院 (College of Electrical Engineering and Computer Science) [zh_TW]
dc.contributor.author-dept: 資訊工程學研究所 (Graduate Institute of Computer Science and Information Engineering) [zh_TW]
Appears in collections: 資訊工程學系 (Department of Computer Science and Information Engineering)

Files in this item:
File: ntu-99-1.pdf (not authorized for public access) | Size: 21.14 MB | Format: Adobe PDF


Except where their copyright terms are otherwise specified, all items in this repository are protected by copyright, with all rights reserved.
