Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/94488
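Beyond citation, the persistent Handle URI above can also be resolved programmatically. Below is a minimal sketch using Python's third-party requests library, assuming network access to the repository; nothing here is specific to this record beyond the URI itself.

```python
# Minimal sketch: resolve the item's persistent Handle URI over HTTP.
# Requires the third-party "requests" package; network access is assumed.
import requests

HANDLE_URI = "http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/94488"

resp = requests.get(HANDLE_URI, timeout=10)
resp.raise_for_status()  # raise if the repository returns an HTTP error
print(resp.status_code)  # 200 on success; resp.text holds the item page HTML
```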
Full metadata record
DC Field: Value (Language)
dc.contributor.advisor: 楊奕軒 (zh_TW)
dc.contributor.advisor: Yi-Hsuan Yang (en)
dc.contributor.author: 吳軍霆 (zh_TW)
dc.contributor.author: Chun-Tin Wu (en)
dc.date.accessioned: 2024-08-16T16:19:41Z
dc.date.available: 2024-08-17
dc.date.copyright: 2024-08-16
dc.date.issued: 2024
dc.date.submitted: 2024-08-14
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/94488
dc.description.abstract (zh_TW):
新視角合成在3D視覺應用中扮演著關鍵角色,如虛擬現實(VR)、增強現實(AR)和電影製作,能夠生成從場景中任意視點拍攝的圖像。這一任務對於動態場景至關重要,但由於複雜的運動和稀疏的數據,面臨諸多挑戰。為了解決這一任務並推動高質量渲染,神經輻射場(NeRF)被提出來通過使用神經隱式函數來表示場景。此外,其繼任者3D高斯濺射(3D-GS)通過採用高效的3D高斯投影進一步加速了渲染速度達到實時水平。然而,由於3D-GS的顯式特性,在使用4D數據渲染動態場景時,即使通過考慮高斯變形場來擴展3D-GS至4D高斯濺射(4D-GS)以實現準確和高效的渲染,其訓練和渲染成本仍然昂貴。
為了解決這些問題,本論文引入了基於網格的4D-GS方法,通過網格表示來進一步提高4D-GS在動態場景渲染中的效率。通過整合網格,我們的框架優化了計算複雜度,加速了處理時間,顯著減少了內存使用,同時保持了高渲染質量。這些在實時渲染能力方面的進步為未來在動態場景操控、重建和下游任務中的應用鋪平了道路。
dc.description.abstract (en):
Novel view synthesis plays a pivotal role in 3D vision applications such as virtual reality (VR), augmented reality (AR), and movie production, enabling the generation of images from arbitrary viewpoints within a scene. The task is essential for dynamic scenes, but faces challenges due to complex motion and sparse data. Neural Radiance Field (NeRF) was proposed to tackle this task and advanced the field toward high-quality rendering by representing scenes with neural implicit functions. Its successor, 3D Gaussian Splatting (3D-GS), has further accelerated rendering to real-time speeds by employing efficient 3D Gaussian projections. However, due to the explicit nature of 3D-GS, training and rendering costs for dynamic scenes with 4D data remain expensive, even after extending 3D-GS to 4D Gaussian Splatting (4D-GS), which models temporal dynamics through Gaussian deformation fields for accurate and efficient rendering.
To address these issues, this thesis introduces Voxels to 4D-GS, enhancing 4D-GS with voxel-based representations for even greater efficiency in dynamic scene rendering. By integrating voxels, our framework reduces computational complexity, accelerates processing, and significantly lowers memory usage while maintaining high rendering quality. These advances in real-time rendering capability pave the way for future applications in dynamic scene manipulation, reconstruction, and downstream tasks.
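To make the two ideas named in the abstract concrete, the following is a minimal illustrative sketch, not the thesis's actual implementation: a coarse voxel assignment of canonical 3D Gaussians and a small MLP deformation field that offsets Gaussian centers at a query time t. All names here (GaussianDeformation, voxel_size, hidden_dim) are hypothetical.

```python
# Illustrative sketch only: canonical 3D Gaussians, a coarse voxel assignment,
# and an MLP deformation field, per the high-level description in the abstract.
import torch
import torch.nn as nn

class GaussianDeformation(nn.Module):
    """Hypothetical deformation field: maps (center, t) to a position offset."""
    def __init__(self, hidden_dim: int = 64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(4, hidden_dim),   # input: (x, y, z, t)
            nn.ReLU(),
            nn.Linear(hidden_dim, 3),   # output: offset (dx, dy, dz)
        )

    def forward(self, centers: torch.Tensor, t: float) -> torch.Tensor:
        # centers: (N, 3) canonical Gaussian centers; broadcast t to every Gaussian.
        time = torch.full((centers.shape[0], 1), t, device=centers.device)
        return centers + self.mlp(torch.cat([centers, time], dim=-1))

canonical = torch.randn(1000, 3)                        # toy canonical centers
voxel_size = 0.1
voxel_idx = torch.floor(canonical / voxel_size).long()  # coarse voxel assignment
deformed = GaussianDeformation()(canonical, t=0.5)      # centers at time t = 0.5
print(voxel_idx.shape, deformed.shape)                  # both torch.Size([1000, 3])
```

Per the table of contents below, the full method also includes a spatial-temporal encoder and a Gaussian deformation decoder that would additionally adjust rotation and scale; this sketch covers only a positional pathway.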
dc.description.provenance (en): Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2024-08-16T16:19:40Z. No. of bitstreams: 0
dc.description.provenance (en): Made available in DSpace on 2024-08-16T16:19:41Z (GMT). No. of bitstreams: 0
dc.description.tableofcontents:
Acknowledgments i
Abstract ii
摘要 iii
Contents iv
List of Tables vi
List of Figures vii
1 Introduction 1
2 Related Works 3
2.1 NeRF Based Dynamic Scene Rendering 3
2.2 4D K-Planes 4
2.3 4D Gaussian Splatting 5
3 Method 6
3.1 Model Overview 7
3.2 Preliminaries 7
3.2.1 3D Gaussian Splatting 7
3.2.2 Grid-based Gaussian Splatting 8
3.2.3 Gaussian Deformation Fields 9
3.3 Scene Voxelization 10
3.4 Neural Gaussian Generation 11
3.5 Deformation Fields 12
3.5.1 Spatial-Temporal Encoder 12
3.5.2 Gaussian Deformation Decoder 13
3.6 Optimization Process 13
3.6.1 First Phase: Coarse Training Stage 13
3.6.2 Second Phase: Fine Training Stage 14
4 Experiment and Discussions 15
4.1 Experimental Settings 15
4.1.1 Real-world Datasets 15
4.2 Results 15
4.2.1 Quantitative Results 15
4.2.2 Qualitative Results 17
4.3 Ablation Study 20
4.3.1 Volume regularization 20
4.4 Discussion 22
4.4.1 Training Iterations 22
4.4.2 Future Work 23
5 Conclusions 25
Bibliography 26
dc.language.iso: en
dc.subject: 三維電腦視覺 (zh_TW)
dc.subject: 動態場景渲染 (zh_TW)
dc.subject: 三維高斯潑濺 (zh_TW)
dc.subject: 深度學習 (zh_TW)
dc.subject: Deep Learning (en)
dc.subject: Dynamic Scene Rendering (en)
dc.subject: 3D Gaussian Splatting (en)
dc.subject: 3D Computer Vision (en)
dc.title: 網格用於高斯潑濺的即時動態場景渲染 (zh_TW)
dc.title: Real-Time Dynamic Scene Rendering with Voxelized Gaussian Splatting (en)
dc.type: Thesis
dc.date.schoolyear: 112-2
dc.description.degree: 碩士 (Master's)
dc.contributor.coadvisor: 陳駿丞 (zh_TW)
dc.contributor.coadvisor: Jun-Cheng Chen (en)
dc.contributor.oralexamcommittee: 王鈺強;陳祝嵩 (zh_TW)
dc.contributor.oralexamcommittee: Yu-Chiang Wang;Chu-Song Chen (en)
dc.subject.keyword: 深度學習,三維電腦視覺,三維高斯潑濺,動態場景渲染 (zh_TW)
dc.subject.keyword: Deep Learning,3D Computer Vision,3D Gaussian Splatting,Dynamic Scene Rendering (en)
dc.relation.page: 29
dc.identifier.doi: 10.6342/NTU202404161
dc.rights.note: 未授權 (Not authorized)
dc.date.accepted: 2024-08-14
dc.contributor.author-college: 電機資訊學院 (College of Electrical Engineering and Computer Science)
dc.contributor.author-dept: 電信工程學研究所 (Graduate Institute of Communication Engineering)
Appears in Collections: 電信工程學研究所 (Graduate Institute of Communication Engineering)

Files in This Item:
File: ntu-112-2.pdf (restricted; not authorized for public access), Size: 14.68 MB, Format: Adobe PDF


All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.
