StyleGAN-based Gaussian Blendshapes for 3D Stylized Head Avatars

Rui-Yang Ju (鞠睿陽)

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/97708

Title:	StyleGAN-based Gaussian Blendshapes for 3D Stylized Head Avatars (基於StyleGAN的高斯混合形變之三維風格化頭像建模)
Author:	Rui-Yang Ju (鞠睿陽)
Advisor:	Yi-Ping Hung (洪一平)
Keywords:	Generative AI, Generative Adversarial Network, Gaussian Splatting, Gaussian Blendshape, 3D Head Reconstruction, 3D Stylized Head Avatar, Facial Animation
Publication year:	2025
Degree:	Master's
Abstract:	3D Gaussian blendshapes have made it possible to reconstruct animatable head avatars from monocular video in real time. Separately, Toonify, a StyleGAN-based method, is widely used for facial image stylization. To synthesize stylized 3D head avatars with Gaussian blendshapes, we propose ToonifyGB, an efficient two-stage framework. In Stage 1 (stylized video generation), we apply an improved StyleGAN to the input video frames, which removes the standard StyleGAN preprocessing requirement of cropping and aligning faces at a fixed resolution and yields a more stable and consistent stylized video. This stability helps the Gaussian blendshapes in the next stage capture high-frequency facial and expression details more accurately, which in turn supports higher-quality animation. In Stage 2 (Gaussian blendshapes synthesis), we learn a stylized neutral head model and a set of expression blendshapes from the generated stylized video. The blendshapes retain the expression characteristics of the real face, and combining them with the neutral head model lets ToonifyGB efficiently render stylized avatars with arbitrary expressions. We validate ToonifyGB on benchmark datasets using two representative styles, Arcane and Pixar; the results indicate that the method generalizes well in terms of visual consistency and expression-driven animation.
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/97708
DOI:	10.6342/NTU202501155
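Full-text license:	Authorized (worldwide open access)
Full-text release date:	2025-07-12
Appears in collections:	Graduate Institute of Networking and Multimedia (資訊網路與多媒體研究所)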
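
The abstract's Stage 2 describes combining a neutral head model with a set of expression blendshapes. As a minimal sketch of that idea, assuming the classical linear blendshape form (neutral attributes plus a weighted sum of per-expression offsets), which this record itself does not spell out, the combination could look like the following. The function name `blend_attributes`, the flat attribute vectors, and the toy numbers are all hypothetical and are not taken from the thesis.

```python
from typing import List, Sequence


def blend_attributes(
    neutral: Sequence[float],
    deltas: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> List[float]:
    """Return neutral + sum_k weights[k] * deltas[k], element-wise.

    Assumption: each blendshape is stored as an offset (delta) from the
    neutral model, and expressions are blended linearly.
    """
    if len(deltas) != len(weights):
        raise ValueError("one weight per blendshape is required")
    result = list(neutral)
    for delta, weight in zip(deltas, weights):
        if len(delta) != len(result):
            raise ValueError("each delta must match the neutral vector length")
        for i, offset in enumerate(delta):
            result[i] += weight * offset
    return result


# Hypothetical toy data: one 3D position and two expression offsets.
neutral_position = [0.0, 1.0, 0.0]
expression_deltas = [[0.0, 0.5, 0.0], [1.0, 0.0, 0.0]]
blended = blend_attributes(neutral_position, expression_deltas, [0.5, 0.25])
```

With all weights set to zero the function returns the neutral attributes unchanged, which is the property that makes a neutral model plus offsets convenient for driving arbitrary expressions.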

Files in this item:

File	Size	Format
ntu-113-2.pdf	122.76 MB	Adobe PDF


Unless a specific copyright license is indicated, all items in this system are protected by copyright, with all rights reserved.
