Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/92605
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | 簡韶逸 | zh_TW
dc.contributor.advisor | Shao-Yi Chien | en
dc.contributor.author | 劉子傑 | zh_TW
dc.contributor.author | Tzu-Chieh Liu | en
dc.date.accessioned | 2024-05-08T16:06:07Z | -
dc.date.available | 2024-05-09 | -
dc.date.copyright | 2024-05-08 | -
dc.date.issued | 2024 | -
dc.date.submitted | 2024-04-26 | -
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/92605 | -
dc.description.abstract | 近年來,隨著神經渲染技術的進步,三維生成對抗網路(3D-GANs)在高保真多視角一致性肖像生成方面取得了重大進展。然而,現有方法仍難以對三維肖像的幾何和紋理進行明確與細緻的操作。這一限制帶來了顯著的障礙,因人們通常希望對其虛擬化身進行更廣泛的編輯,例如配飾的穿戴,以在虛擬世界中創造迷人與時尚的替身。因此,將肖像與配飾(或廣義至各個肖像外的部件)分離,並具有獨立控制各個部件的能力就顯得極為重要。
在本篇論文中,我們提出了多視角肖像配件生成網路(FOVAD),以生成具有各自分離屬性的三維肖像配飾。具體地說,為了將現有三維模型擴展至更全面的編輯能力,我們將整個場景的建模分解為兩個部分,僅肖像和僅配件,並將它們重新組合以進行重建。我們進一步引進了「由草圖生成配件」模組(Sketch2Accessory),它根據用戶隨意的筆跡或塗鴉生成對應的配件,並擁有自由選擇配件紋理的能力。藉由FOVAD網路,多視角肖像可以達到比起過去方法在此限制條件下,更好的操縱自由度與更佳的生成結果。從全局風格化、局部編輯到用戶指定的配件配戴,可能性是無窮的。 | zh_TW
dc.description.abstract | Recently, with the immense progress of neural rendering techniques, 3D-aware Generative Adversarial Networks (GANs) have leaped forward toward high-fidelity, multi-view-consistent portrait synthesis. However, existing approaches still struggle to provide fine-grained control over both the geometry and the texture of 3D portraits. This restriction presents a noteworthy obstacle, given that people typically fancy diversely accessorizing their virtual avatars and creating fashion-forward looks. As such, decoupling portraits and accessories (control over individual components) with different shapes and appearances (control over geometry and texture space) is highly in demand.
In this work, we propose the FOVAD (Free-View Portrait Adornment) generative network to bridge the aforementioned gap and generate a diverse set of accessories on 3D portraits with respective disentangled attributes. More specifically, to expand existing 3D models to a more comprehensive level of editability, we decompose the modeling of the entire scene into two distinct parts, portrait only and decorative attributes, and re-compose them for reconstruction, while employing only a single scene-shared representation. Detaching accessories from portraits facilitates effortless “Sketch2Accessory”, which creates accessories from the user's unrestricted accessory sketch or doodle. Armed with the FOVAD framework, free-view portraits can be customized in ways that were once impossible with only monocular, readily collectible training data. From global or local editing to wearing multiple user-specified accessories, the possibilities are endless. | en
dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2024-05-08T16:06:07Z. No. of bitstreams: 0 | en
dc.description.provenance | Made available in DSpace on 2024-05-08T16:06:07Z (GMT). No. of bitstreams: 0 | en
dc.description.tableofcontents | Abstract i
List of Figures iv
List of Figures vi
1 Introduction 1
1.1 Background 1
1.2 Free-view Portrait Adornment 3
1.3 Contribution 6
1.4 Thesis Organization 6
2 Related Work 8
2.1 2D Image Synthesis 8
2.2 Neural Radiance Fields (NeRFs) 10
2.3 3D-aware Image Synthesis 13
2.4 Controllable Portrait Synthesis 16
2.5 GAN Inversion 18
3 Proposed Methodology 19
3.1 Triplane Scene-Shared Representation 24
3.1.1 Bias-Conscious Mapper 24
3.1.2 Portrait Geometry Field 26
3.1.3 Accessory Viable Field 28
3.2 Structure-Guided Colorizer 29
3.3 Three-Pronged Training Scheme 32
3.4 Sketch2Accessories Inversion 34
4 Experiments 37
4.1 Implementation Details 37
4.2 Datasets and Evaluation Metrics 38
4.3 Comparisons 38
4.4 Ablation Study 39
4.5 Applications 42
4.6 Visualization and Additional Results 44
5 Conclusion 49
Reference 50
-
dc.language.iso | en | -
dc.subject | 神經輻射場 | zh_TW
dc.subject | 臉部編輯 | zh_TW
dc.subject | 影像合成 | zh_TW
dc.subject | 三維生成對抗網路 | zh_TW
dc.subject | Face editing | en
dc.subject | neural radiance fields | en
dc.subject | neural rendering | en
dc.subject | 3D GAN | en
dc.subject | Image synthesis | en
dc.subject | semantic-mask-based interfaces | en
dc.title | 在神經輻射場中實現更全面的三維人像編輯 | zh_TW
dc.title | Elevate 3D Editing of Neural Portrait to a More Comprehensive Level | en
dc.type | Thesis | -
dc.date.schoolyear | 112-2 | -
dc.description.degree | 碩士 | -
dc.contributor.oralexamcommittee | 曹昱;陳祝嵩;莊永裕;鄭文皇 | zh_TW
dc.contributor.oralexamcommittee | Yu Tsao;Chu-Song Chen;YY Chuang;Wen-Huang Cheng | en
dc.subject.keyword | 臉部編輯,影像合成,三維生成對抗網路,神經輻射場 | zh_TW
dc.subject.keyword | Face editing,Image synthesis,semantic-mask-based interfaces,neural radiance fields,neural rendering,3D GAN | en
dc.relation.page | 55 | -
dc.identifier.doi | 10.6342/NTU202400881 | -
dc.rights.note | 同意授權(限校園內公開) | -
dc.date.accepted | 2024-04-26 | -
dc.contributor.author-college | 電機資訊學院 | -
dc.contributor.author-dept | 電子工程學研究所 | -
dc.date.embargo-lift | 2029-04-21 | -
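The dc.description.abstract field above describes decomposing the scene into a portrait-only part and an accessory-only part, then re-composing them for reconstruction. As a rough, illustrative sketch only (this is not the thesis's actual FOVAD implementation; the function names, the additive-density composition rule, and the toy inputs are all assumptions), the snippet below shows how two NeRF-style radiance fields could be blended along a single camera ray with standard volume-rendering quadrature.

```python
import numpy as np

def composite_fields(portrait_sigma, portrait_rgb,
                     accessory_sigma, accessory_rgb, deltas):
    """Blend a portrait-only field and an accessory-only field along one ray.

    portrait_sigma, accessory_sigma : (N,)   per-sample volume densities
    portrait_rgb, accessory_rgb     : (N, 3) per-sample colors
    deltas                          : (N,)   distances between adjacent samples
    Returns the composited RGB value for this ray.
    """
    # Assumed composition rule: densities add, and each sample's color is the
    # density-weighted mix of the two fields' colors at that sample.
    sigma = portrait_sigma + accessory_sigma
    denom = np.maximum(sigma, 1e-8)[:, None]
    rgb = (portrait_sigma[:, None] * portrait_rgb +
           accessory_sigma[:, None] * accessory_rgb) / denom

    # Standard NeRF-style volume-rendering quadrature (alpha compositing).
    alpha = 1.0 - np.exp(-sigma * deltas)
    transmittance = np.cumprod(np.concatenate(([1.0], 1.0 - alpha[:-1] + 1e-10)))
    weights = transmittance * alpha
    return (weights[:, None] * rgb).sum(axis=0)

# Toy usage: 64 samples along one ray. The accessory field is empty except
# for a dense slab in the middle of the ray, so it occludes the portrait there.
n = 64
deltas = np.full(n, 0.05)
portrait_sigma = np.full(n, 2.0)
portrait_rgb = np.tile([0.8, 0.6, 0.5], (n, 1))   # skin-like portrait color
accessory_sigma = np.zeros(n)
accessory_sigma[28:36] = 50.0
accessory_rgb = np.tile([0.1, 0.1, 0.9], (n, 1))  # blue accessory color
print(composite_fields(portrait_sigma, portrait_rgb,
                       accessory_sigma, accessory_rgb, deltas))
```

In this toy setup, wherever the hypothetical accessory field has high density it dominates the composited color, mirroring the abstract's idea of adding or removing accessories without altering the underlying portrait.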
Appears in Collections: 電子工程學研究所

Files in this item:
File | Size | Format
ntu-112-2.pdf (restricted access; not publicly available) | 6.48 MB | Adobe PDF


Except where a different copyright statement is explicitly indicated, all items in this system are protected by copyright, with all rights reserved.
