Please use this identifier to cite or link to this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/95814
Title: | 三維虛擬分身重建系統基於分數蒸餾採樣與顯式表示 HARDER: 3D Human Avatar Reconstruction with Distillation and Explicit Representation |
Authors: | 游鈞皓 Chun-Hau Yu |
Advisor: | 傅立成 Li-Chen Fu |
Keyword: | Avatar, 3D Human Reconstruction, Latent Diffusion Models, Score Distillation Sampling |
Publication Year: | 2024 |
Degree: | Master's |
Abstract: | AR/VR technology has attracted considerable attention in recent years because of its potential applications in education, social activities, digital entertainment, and remote collaboration. In a virtual space, every user needs a 3D avatar to represent them. This thesis therefore proposes HARDER, a 3D human avatar reconstruction system that lets users reconstruct a fully textured, clothed, and real-time animatable 3D human avatar.
Most existing methods either impose overly strict data requirements, such as depth information or multi-view images, or suffer significant performance drops in specific regions. To address these challenges, we introduce the Score Distillation Sampling (SDS) technique and design the FSIC and RADR modules so that a Latent Diffusion Model (LDM) guides the reconstruction process, especially in unseen regions, and improves the reconstruction results. We also develop several training strategies, including personalized LDM, delayed SDS, focused SDS, and multi-pose SDS, to make training more efficient. Our avatars use an explicit representation that is compatible with most modern computer graphics pipelines. HARDER needs only a single RGB image to generate a highly realistic 3D avatar, and the entire reconstruction and real-time animation process runs on a single consumer-grade GPU, making the application more accessible. |
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/95814 |
DOI: | 10.6342/NTU202403194 |
Fulltext Rights: | Not authorized |
Appears in Collections: | Department of Computer Science and Information Engineering |
Files in This Item:
File | Size | Format | |
---|---|---|---|
ntu-112-2.pdf Restricted Access | 2.55 MB | Adobe PDF |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.