DSpace

The DSpace institutional repository preserves digital materials of all kinds (e.g., text, images, PDF files) and makes them easy to access.

Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/46594
Full metadata record
Each entry below is shown as: DC field: value [language]
dc.contributor.advisor: 陳炳宇 (Bing-Yu Chen)
dc.contributor.author: Yu-Mei Chen [en]
dc.contributor.author: 陳裕美 [zh_TW]
dc.date.accessioned: 2021-06-15T05:17:41Z
dc.date.available: 2010-07-22
dc.date.copyright: 2010-07-22
dc.date.issued: 2010
dc.date.submitted: 2010-07-21
dc.identifier.citation:
[1] V. Blanz and T. Vetter. A morphable model for the synthesis of 3D faces. In ACM SIGGRAPH 1999 Conference Proceedings, pages 187–194, 1999.
[2] M. Brand. Voice puppetry. In ACM SIGGRAPH 1999 Conference Proceedings, pages 21–28, 1999.
[3] C. Bregler, M. Covell, and M. Slaney. Video Rewrite: driving visual speech with audio. In ACM SIGGRAPH 1997 Conference Proceedings, pages 353–360, 1997.
[4] I. Buck, A. Finkelstein, C. Jacobs, A. Klein, D. H. Salesin, J. Seims, R. Szeliski, and K. Toyama. Performance-driven hand-drawn animation. In Proceedings of the 2000 International Symposium on Non-Photorealistic Animation and Rendering, pages 101–108, 2000.
[5] Y. Cao, P. Faloutsos, E. Kohler, and F. Pighin. Real-time speech motion synthesis from recorded motions. In Proceedings of the 2004 ACM SIGGRAPH/Eurographics Symposium on Computer Animation, pages 345–353, 2004.
[6] J. Chai, J. Xiao, and J. Hodgins. Vision-based control of 3D facial animation. In Proceedings of the 2003 ACM SIGGRAPH/Eurographics Symposium on Computer Animation, pages 193–206, 2003.
[7] Y.-J. Chang and T. Ezzat. Transferable videorealistic speech animation. In Proceedings of the 2005 ACM SIGGRAPH/Eurographics Symposium on Computer Animation, pages 143–151, 2005.
[8] B. Choe, H. Lee, and H.-S. Ko. Performance-driven muscle-based facial animation. The Journal of Visualization and Computer Animation, 12(2):67–79, 2001.
[9] E. Chuang and C. Bregler. Mood swings: expressive speech animation. ACM Transactions on Graphics, 24(2):331–347, 2005.
[10] E. S. Chuang, H. Deshpande, and C. Bregler. Facial expression space learning. In Pacific Graphics 2002 Conference Proceedings, pages 68–76, 2002.
[11] M. M. Cohen and D. W. Massaro. Modeling coarticulation in synthetic visual speech. In Computer Animation 1993 Conference Proceedings, pages 139–156, 1993.
[12] P. Cosi, E. M. Caldognetto, G. Perin, and C. Zmarich. Labial coarticulation modeling for realistic facial animation. In Proceedings of the 2002 IEEE International Conference on Multimodal Interfaces, pages 505–510, 2002.
[13] P. E. Debevec and J. Malik. Recovering high dynamic range radiance maps from photographs. In ACM SIGGRAPH 1997 Conference Proceedings, pages 369–378, 1997.
[14] Z. Deng, P.-Y. Chiang, P. Fox, and U. Neumann. Animating blendshape faces by cross-mapping motion capture data. In Proceedings of the 2006 Symposium on Interactive 3D Graphics and Games, pages 43–48, 2006.
[15] Z. Deng and U. Neumann. eFASE: expressive facial animation synthesis and editing with phoneme-level controls. In Proceedings of the 2006 ACM SIGGRAPH/Eurographics Symposium on Computer Animation, pages 251–259, 2006.
[16] Z. Deng and U. Neumann. Data-Driven 3D Facial Animation. Springer, 2008.
[17] Z. Deng, U. Neumann, J. Lewis, T.-Y. Kim, M. Bulut, and S. Narayanan. Expressive facial animation synthesis by learning speech coarticulation and expression spaces. IEEE Transactions on Visualization and Computer Graphics, 12(6):1523–1534, 2006.
[18] P. Ekman and W. V. Friesen. Manual for the Facial Action Coding System. Consulting Psychologists Press, 1977.
[19] T. Ezzat, G. Geiger, and T. Poggio. Trainable videorealistic speech animation. ACM Transactions on Graphics, 21(3):388–398, 2002. (SIGGRAPH 2002 Conference Proceedings).
[20] B. J. Frey and D. Dueck. Clustering by passing messages between data points. Science, 315(5814):972–976, 2007.
[21] B. Guenter, C. Grimm, D. Wood, H. Malvar, and F. Pighin. Making faces. In ACM SIGGRAPH 1998 Conference Proceedings, pages 55–66, 1998.
[22] X. Huang, F. Alleva, H.-W. Hon, M.-Y. Hwang, K.-F. Lee, and R. Rosenfeld. The SPHINX-II speech recognition system: an overview. Computer Speech and Language, 7(2):137–148, 1993.
[23] E. Ju and J. Lee. Expressive facial gestures from motion capture data. Computer Graphics Forum, 27(2):381–388, 2008. (Eurographics 2008 Conference Proceedings).
[24] I.-J. Kim and H.-S. Ko. 3D lip-synch generation with data-faithful machine learning. Computer Graphics Forum, 26(3):295–301, 2007. (Eurographics 2007 Conference Proceedings).
[25] C. L. Lawson and R. J. Hanson. Solving Least Squares Problems. Prentice-Hall, 1974.
[26] J. P. Lewis, J. Mooser, Z. Deng, and U. Neumann. Reducing blendshape interference by selected motion attenuation. In Proceedings of the 2005 Symposium on Interactive 3D Graphics and Games, pages 25–29, 2005.
[27] A. Löfqvist. Speech as audible gestures. In Speech Production and Speech Modeling, pages 289–322. Kluwer Academic Publishers, 1990.
[28] W.-C. Ma, A. Jones, J.-Y. Chiang, T. Hawkins, S. Frederiksen, P. Peers, M. Vukovic, M. Ouhyoung, and P. Debevec. Facial performance synthesis using deformation-driven polynomial displacement maps. ACM Transactions on Graphics, 27(5):1–10, 2008. (SIGGRAPH Asia 2008 Conference Proceedings).
[29] K. Madsen, H. B. Nielsen, and O. Tingleff. Methods for non-linear least squares problems. Technical report, Technical University of Denmark, 2004.
[30] K. Na and M. Jung. Hierarchical retargetting of fine facial motions. Computer Graphics Forum, 23(3):687–695, 2004. (Eurographics 2004 Conference Proceedings).
[31] J.-Y. Noh and U. Neumann. Expression cloning. In ACM SIGGRAPH 2001 Conference Proceedings, pages 277–288, 2001.
[32] F. I. Parke and K. Waters. Computer Facial Animation, 2nd Ed. AK Peters, 2008.
[33] F. Pighin and J. P. Lewis. Performance-driven facial animation: introduction. In ACM SIGGRAPH 2006 Conference Course Notes, 2006.
[34] H. Pyun, Y. Kim, W. Chae, H. W. Kang, and S. Y. Shin. An example-based approach for facial expression cloning. In Proceedings of the 2003 ACM SIGGRAPH/Eurographics Symposium on Computer Animation, pages 167–176, 2003.
[35] S. Kshirsagar and N. Magnenat-Thalmann. Visyllable based speech animation. Computer Graphics Forum, 22(3), 2003. (Eurographics 2003 Conference Proceedings).
[36] E. Sifakis, I. Neverov, and R. Fedkiw. Automatic determination of facial muscle activations from sparse motion capture marker data. ACM Transactions on Graphics, 24(3):417–425, 2005. (SIGGRAPH 2005 Conference Proceedings).
[37] E. Sifakis, A. Selle, A. Robinson-Mosher, and R. Fedkiw. Simulating speech with a physics-based facial muscle model. In Proceedings of the 2006 ACM SIGGRAPH/Eurographics Symposium on Computer Animation, pages 261–270, 2006.
[38] R. W. Sumner and J. Popović. Deformation transfer for triangle meshes. ACM Transactions on Graphics, 23(3):399–405, 2004. (SIGGRAPH 2004 Conference Proceedings).
[39] D. Vlasic, M. Brand, H. Pfister, and J. Popović. Face transfer with multilinear models. ACM Transactions on Graphics, 24(3):426–433, 2005. (SIGGRAPH 2005 Conference Proceedings).
[40] K. Wampler, D. Sasaki, L. Zhang, and Z. Popović. Dynamic, expressive speech animation from a single mesh. In Proceedings of the 2007 ACM SIGGRAPH/Eurographics Symposium on Computer Animation, pages 53–62, 2007.
[41] Y. Wang, X. Huang, C.-S. Lee, S. Zhang, Z. Li, D. Samaras, D. Metaxas, A. Elgammal, and P. Huang. High resolution acquisition, learning and transfer of dynamic 3-D facial expressions. Computer Graphics Forum, 23(3):677–686, 2004. (Eurographics 2004 Conference Proceedings).
[42] Z. Deng and X. Ma. Perceptually guided expressive facial animation. In Proceedings of the 2008 ACM SIGGRAPH/Eurographics Symposium on Computer Animation, pages 67–76, 2008.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/46594
dc.description.abstract: Speech animation has traditionally been regarded as an important yet very difficult research topic, and lip-sync animation is even more challenging because of the intricate structure of the facial muscles and how quickly they move. Many approaches to lip-sync animation have been proposed, but none of them is both fast and efficient. In this thesis we propose an efficient framework that, given a speech track and its script, synthesizes lip-sync animation for a designated character model. The system takes animation control signals as training data: the data are first clustered, and a dominated animeme model is learned for each cluster with an EM-style optimization. Each dominated animeme model has two parts: a polynomial-form animeme and a corresponding Gaussian function that models the mutual influence of coarticulation. Finally, given speech and a script, the dominated animeme models generate new animation control signals to produce the lip-sync animation. The results preserve the shape characteristics of the character model, and because the control signals are generated in near real time, the technique is broadly applicable to lip-sync animation prototyping, multilingual lip-sync animation, mass animation production, and similar uses. [zh_TW]
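A minimal reading of this two-part decomposition, under the assumption that it follows the dominance-blending form popularized by the Cohen and Massaro model [11] (the symbols below are illustrative, not taken from the thesis text): a synthesized control signal $C(t)$ blends the per-phoneme animemes as

$$C(t) = \frac{\sum_i D_i(t)\,A_i(t)}{\sum_i D_i(t)}, \qquad A_i(t) = \sum_{k=0}^{n} a_{i,k}\,t^k, \qquad D_i(t) = \exp\!\left(-\frac{(t-c_i)^2}{2\sigma_i^2}\right),$$

where $A_i$ is the polynomial-form animeme fitted for the $i$-th phoneme instance and $D_i$ is its Gaussian dominance function centered at $c_i$. Because neighboring dominance functions overlap, adjacent phonemes influence one another, which is the coarticulation effect the abstract describes.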
dc.description.abstract: Speech animation is traditionally considered important but tedious work for most applications, especially when lip synchronization (lip-sync) is taken into account, because the muscles of the face are complex and interact dynamically. Although several methods have been proposed to ease the burden on artists creating facial and speech animation, almost none is both fast and efficient. In this thesis, we introduce a framework for synthesizing lip-sync character speech animation from a given speech sequence and its corresponding text. We first cluster the training data and train a dominated animeme model for every group within each phoneme by learning the animation control signals of the character through an EM-style optimization; each dominated animeme model is then decomposed into a polynomial-fitted animeme model and a corresponding dominance function that accounts for coarticulation. Finally, given a novel speech sequence and its corresponding text, a lip-sync character animation can be synthesized in a very short time with the dominated animeme models. The synthesized lip-sync animation preserves even the exaggerated characteristics of the character's facial geometry. Moreover, since our method synthesizes an acceptable and robust lip-sync animation in near real time, it can serve many applications, such as lip-sync animation prototyping, multilingual animation reproduction, avatar speech, and mass animation production. [en]
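To make the synthesis step concrete, here is a minimal Python sketch under the same assumptions as the equations above. Every name in it is hypothetical, and the phoneme timings are assumed to come from aligning the speech with its text beforehand (for example with a recognizer such as SPHINX-II [22]); it illustrates dominance-blended animemes, not the thesis implementation.

    import math

    def animeme(coeffs, t):
        # Polynomial-fitted animeme: sum_k a_k * t^k at phoneme-local time t.
        return sum(a * t ** k for k, a in enumerate(coeffs))

    def dominance(t, center, sigma):
        # Gaussian dominance: how strongly a phoneme instance influences time t.
        return math.exp(-((t - center) ** 2) / (2.0 * sigma ** 2))

    def synthesize_control_signal(phonemes, timings, models, fps=30.0):
        # phonemes : phoneme labels from the aligned script, e.g. ["m", "a", ...]
        # timings  : matching list of (start, end) times in seconds
        # models   : dict mapping label -> (poly_coeffs, sigma), trained offline
        # Returns one animation control signal sampled at fps, blending all
        # animemes with normalized dominance weights (coarticulation).
        duration = timings[-1][1]
        frames = int(duration * fps) + 1
        signal = []
        for f in range(frames):
            t = f / fps
            weighted_sum, weight_total = 0.0, 0.0
            for label, (start, end) in zip(phonemes, timings):
                coeffs, sigma = models[label]
                center = 0.5 * (start + end)
                w = dominance(t, center, sigma)
                local_t = (t - start) / max(end - start, 1e-6)  # normalized phoneme time
                weighted_sum += w * animeme(coeffs, local_t)
                weight_total += w
            signal.append(weighted_sum / max(weight_total, 1e-9))
        return signal

Per the table of contents below, the thesis additionally clusters animemes and selects among the per-cluster models (Sections 5.1 and 5.5), and would run such a blend once per control channel of the rig; the sketch omits both.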
dc.description.provenance: Made available in DSpace on 2021-06-15T05:17:41Z (GMT). No. of bitstreams: 1. ntu-99-R97922066-1.pdf: 21642760 bytes, checksum: bf0cbadb2aa4ddeb5798ededbf6f3d74 (MD5). Previous issue date: 2010. [en]
dc.description.tableofcontents:
Abstract 3
1 Introduction 11
2 Related Work 15
2.1 Facial Animation and Modeling 15
2.2 Lip-Sync Speech Animation 16
3 System Overview 21
4 Data Collection and Cross-Mapping 25
4.1 Data Collection 25
4.2 Cross-Mapping 26
5 Dominated Animeme Model 29
5.1 Animeme Clustering 30
5.2 Animeme Modeling 31
5.3 Dominance Function 34
5.4 Dominated Animeme Model Construction 36
5.5 Dominated Animeme Model Selection 36
5.6 Dominated Animeme Model Synthesis 38
6 Experimental Results and Discussion 39
7 Conclusion and Future Work 51
Bibliography 53
dc.language.iso: en
dc.subject: 臉部動畫 (facial animation) [zh_TW]
dc.subject: 對嘴動畫 (lip-sync animation) [zh_TW]
dc.subject: 說話動畫 (speech animation) [zh_TW]
dc.subject: speech animation [en]
dc.subject: facial animation [en]
dc.subject: lip-sync speech animation [en]
dc.title: 利用動作元素主導模式之角色對嘴動畫 (Animating Lip-Sync Characters with Dominated Animeme Models) [zh_TW]
dc.title: Animating Lip-Sync Characters with Dominated Animeme Models [en]
dc.type: Thesis
dc.date.schoolyear: 98-2 (ROC academic year 98, second semester)
dc.description.degree: 碩士 (Master)
dc.contributor.oralexamcommittee: 林文杰 (Wen-Chieh Lin), 林奕成 (I-Chen Lin)
dc.subject.keyword: 對嘴動畫 (lip-sync animation), 說話動畫 (speech animation), 臉部動畫 (facial animation) [zh_TW]
dc.subject.keyword: speech animation, lip-sync speech animation, facial animation [en]
dc.relation.page: 57
dc.rights.note: 有償授權 (paid authorization) 
dc.date.accepted: 2010-07-21
dc.contributor.author-college: 電機資訊學院 (College of Electrical Engineering and Computer Science) [zh_TW]
dc.contributor.author-dept: 資訊工程學研究所 (Graduate Institute of Computer Science and Information Engineering) [zh_TW]
Appears in collections: 資訊工程學系 (Department of Computer Science and Information Engineering)

Files in this item:
File: ntu-99-1.pdf (not authorized for public access) | Size: 21.14 MB | Format: Adobe PDF


Except where their copyright terms are otherwise specified, all items in this repository are protected by copyright, with all rights reserved.
