Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/31767

Full metadata record
dc.contributor.advisor: 莊永裕
dc.contributor.author: Hong-Dien Chen (en)
dc.contributor.author: 陳宏典 (zh_TW)
dc.date.accessioned: 2021-06-13T03:19:39Z
dc.date.available: 2006-08-01
dc.date.copyright: 2006-08-01
dc.date.issued: 2006
dc.date.submitted: 2006-07-28
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/31767
dc.description.abstract: Image-based facial animation has achieved a high degree of realism; it can be applied to low-bandwidth video conferencing or serve as a virtual teacher in language learning. However, the technique first requires recording a five- to ten-minute training video of the specific user and analyzing it to build the model that generates the animation, which limits its applications. We propose a simple method: a new user only needs to capture a few specific images, and by reusing the model built for the original user, new facial animation can be generated. (zh_TW)
dc.description.abstract: Image-based videorealistic speech animation achieves such a high degree of visual realism that it can potentially be used to create virtual teachers for language learning, digital characters in movies, or users' representatives in very low bit-rate video conferencing. However, it comes at the cost of collecting a large video corpus of the specific person to be animated. This requirement hinders broad application, since a large video corpus recorded under a controlled setup may not be easy to obtain for each person. Hence, we adopt a simple method that allows us to transfer the original animation model to a novel person using only a few lip images. (en)
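As context for the abstract: the "Trainable Videorealistic Speech Animation" framework named in the table of contents below synthesizes mouth images from a multidimensional morphable model (MMM). The following is a minimal schematic sketch, assuming the standard MMM formulation rather than quoting the thesis; the symbols F_i, I_i, and W are assumed names for the prototype flow fields, reference-aligned prototype textures, and the image-warping operator.

% Schematic MMM synthesis (illustrative sketch, assuming the standard
% MMM formulation; not the thesis's exact equations).
% F_i      : optical flow of prototype i relative to a reference mouth frame
% \hat{I}_i: texture of prototype i, pre-warped into the reference frame
% W(I, F)  : forward-warp of image I along flow field F
\[
  \mathbf{F}(\boldsymbol{\alpha}) \;=\; \sum_{i=1}^{N} \alpha_i \mathbf{F}_i ,
  \qquad
  I(\boldsymbol{\alpha},\boldsymbol{\beta}) \;=\;
  W\!\Bigl( \sum_{i=1}^{N} \beta_i \hat{I}_i ,\; \mathbf{F}(\boldsymbol{\alpha}) \Bigr),
  \qquad
  \sum_{i} \alpha_i = \sum_{i} \beta_i = 1 .
\]

Read this way, the transfer described in the abstract would amount to re-estimating the prototype textures and their correspondences for a new person from a few lip images while reusing the trained flow and phoneme models, which plausibly corresponds to the Initialization, Flow Matching, and Texture Matching steps listed under Chapter 4 below.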
dc.description.provenance: Made available in DSpace on 2021-06-13T03:19:39Z (GMT). No. of bitstreams: 1. ntu-95-R93944015-1.pdf: 2012123 bytes, checksum: 6116fe41ad2e390d30c21aca4b515e2a (MD5). Previous issue date: 2006. (en)
dc.description.tableofcontents:
CHAPTER 1 INTRODUCTION
CHAPTER 2 RELATED WORK
2.1. FACIAL CODING
2.2. MODEL-BASED FACIAL VIDEO SYNTHESIS
2.3. IMAGE-BASED FACIAL VIDEO SYNTHESIS
CHAPTER 3 BACKGROUND: TRAINABLE VIDEOREALISTIC SPEECH ANIMATION
3.1. CORPUS
3.2. PRE-PROCESSING
3.3. MULTIDIMENSIONAL MORPHABLE MODELS
3.3.1. MMM Construction
3.3.2. Synthesis
3.3.3. Analysis
3.4. PHONEME MODELS
3.4.1. Phoneme Model Construction
3.4.2. Trajectory Synthesis
3.4.3. Training
3.5. POST-PROCESSING
CHAPTER 4 MODEL TRANSFER
4.1. INITIALIZATION
4.2. FLOW MATCHING
4.3. TEXTURE MATCHING
4.4. ANALYSIS AND SYNTHESIS
CHAPTER 5 EXPERIMENTAL RESULTS
CHAPTER 6 DISCUSSIONS AND FUTURE WORK
CHAPTER 7 AN APPLICATION EXAMPLE
REFERENCES
dc.language.iso: en
dc.title: 可置換之語音驅動唇形合成方法 (zh_TW)
dc.title: Transferable Speech-Driven Lips Synthesis (en)
dc.type: Thesis
dc.date.schoolyear: 94-2
dc.description.degree: 碩士 (Master's)
dc.contributor.coadvisor: 陳炳宇
dc.contributor.oralexamcommittee: 林文杰, 林奕成
dc.subject.keyword: 人臉動畫 (facial animation) (zh_TW)
dc.subject.keyword: speech animation (en)
dc.relation.page: 64
dc.rights.note: 有償授權 (paid authorization required)
dc.date.accepted: 2006-07-30
dc.contributor.author-college: 電機資訊學院 (College of Electrical Engineering and Computer Science) (zh_TW)
dc.contributor.author-dept: 資訊網路與多媒體研究所 (Graduate Institute of Networking and Multimedia) (zh_TW)
Appears in collections: 資訊網路與多媒體研究所 (Graduate Institute of Networking and Multimedia)

Files in this item:
File: ntu-95-1.pdf (not currently authorized for public access)
Size: 1.96 MB
Format: Adobe PDF