NTU Theses and Dissertations Repository
Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/32717
Title: Speech-Driven 3D Facial Animation (語音驅動之3維人臉動畫)
Authors: Jun-Ze Huang (黃鈞澤)
Advisor: Bing-Yu Chen (陳炳宇)
Co-Advisor: Yung-Yu Chuang (莊永裕)
Keywords: speech, facial animation, tracking
Publication Year : 2006
Degree: Master's
Abstract: It is often difficult to animate a face model speaking a specific piece of speech; even for professional animators, it takes a lot of time. Our work provides a speech-driven 3D facial animation system that allows the user to easily generate facial animations. The user only needs to supply a speech recording as input, and the output is a 3D facial animation corresponding to that speech.
Our system consists of three parts. The first is the MMM (multidimensional morphable model), a model built from pre-recorded training video using machine-learning techniques. We use the MMM to generate a realistic speech video that matches the input speech.
The second part is facial tracking, which locates the feature points of the human subject's face in the synthesized speech video.
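The abstract does not specify how the feature points are tracked, so as a minimal illustration of the idea only, the sketch below tracks one feature point between two frames by exhaustive sum-of-squared-differences patch matching (real facial trackers are far more elaborate):

```python
import numpy as np

def track_point(prev_frame, next_frame, pt, patch=3, search=5):
    """Track a feature point from prev_frame to next_frame by matching
    the patch around pt against nearby patches in the next frame.
    Illustrative only; actual facial trackers are more sophisticated."""
    y, x = pt
    tmpl = prev_frame[y - patch:y + patch + 1, x - patch:x + patch + 1]
    best, best_pt = np.inf, pt
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            yy, xx = y + dy, x + dx
            cand = next_frame[yy - patch:yy + patch + 1,
                              xx - patch:xx + patch + 1]
            if cand.shape != tmpl.shape:
                continue  # candidate patch falls outside the frame
            cost = np.sum((cand - tmpl) ** 2)  # sum of squared differences
            if cost < best:
                best, best_pt = cost, (yy, xx)
    return best_pt
```

For example, a bright 5x5 blob centered at (20, 20) in one frame and at (22, 21) in the next is correctly followed by `track_point(prev, nxt, (20, 20))`.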
The third part is Mesh-IK (mesh-based inverse kinematics). Mesh-IK uses the motion of the feature points as a guideline to deform the 3D face model so that the deformed model matches the appearance of the corresponding frame of the speech video. The output is thus a 3D facial animation.
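The true Mesh-IK formulation operates on deformation gradients, which the abstract does not detail; as a deliberately simplified stand-in, the constraint-guided deformation idea can be sketched as a linear least-squares problem that preserves rest-pose edge vectors while softly pinning feature vertices to tracked target positions:

```python
import numpy as np

def deform_mesh(rest, edges, pins, w=10.0):
    """Least-squares mesh deformation: keep every edge close to its
    rest-pose vector while pulling pinned vertices to their targets.
    rest  : (n, d) rest-pose vertex positions
    edges : list of (i, j) vertex index pairs
    pins  : dict {vertex index: target position}, the feature constraints
    w     : weight of the soft positional constraints
    """
    n, d = rest.shape
    rows, rhs = [], []
    for i, j in edges:                       # edge-preservation terms
        r = np.zeros(n); r[i], r[j] = 1.0, -1.0
        rows.append(r); rhs.append(rest[i] - rest[j])
    for i, tgt in pins.items():              # soft positional constraints
        r = np.zeros(n); r[i] = w
        rows.append(r); rhs.append(w * np.asarray(tgt, dtype=float))
    A, b = np.array(rows), np.array(rhs)     # b is (m, d): solve per axis
    return np.linalg.lstsq(A, b, rcond=None)[0]
```

Pinning the two endpoints of a straight three-vertex chain to translated targets recovers the rigid translation exactly, since it satisfies every edge and pin term with zero residual.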

Facial tracking and Mesh-IK can also take a real speech video, or even a real expression video, as input and produce the corresponding facial animation.
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/32717
Fulltext Rights: Authorized access for a fee (有償授權)
Appears in Collections: Department of Information Management (資訊管理學系)

Files in This Item:
ntu-95-1.pdf (Restricted Access), 2.48 MB, Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
