Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/99649
Full metadata record
dc.contributor.advisor [zh_TW]: 洪一平
dc.contributor.advisor [en]: Yi-Ping Hung
dc.contributor.author [zh_TW]: 黃義峰
dc.contributor.author [en]: I-Feng Huang
dc.date.accessioned: 2025-09-17T16:15:41Z
dc.date.available: 2025-09-18
dc.date.copyright: 2025-09-17
dc.date.issued: 2025
dc.date.submitted: 2025-08-05
dc.identifier.citation:
[1] Y. Aksoy, T. O. Aydın, M. Pollefeys, and A. Smolić. Interactive high-quality green screen keying via color unmixing. ACM Trans. Graph., 35(5):152:1–152:12, 2016.
[2] A. Bai, Z. Yang, J. Qin, Y. Shi, and K. Li. Lwgss: Light-weight green spill suppression for green screen matting. In J. Ren, A. Hussain, I. Y. Liao, R. Chen, K. Huang, H. Zhao, X. Liu, P. Ma, and T. Maul, editors, Advances in Brain Inspired Cognitive Systems, pages 365–374, Singapore, 2024. Springer Nature Singapore.
[3] B. Chen, C. Zhou, and X. Sun. Enhancing green screen matting with group normalization and perceptual loss for color overflow and complex edges. The Visual Computer, 2025. Accepted: 14 May 2025, Published: 09 June 2025.
[4] H. Gao, R. Li, S. Tulsiani, B. Russell, and A. Kanazawa. Monocular dynamic view synthesis: A reality check. In NeurIPS, 2022.
[5] M. Işık, M. Rünz, M. Georgopoulos, T. Khakhulin, J. Starck, L. Agapito, and M. Nießner. Humanrf: High-fidelity neural radiance fields for humans in motion. ACM Transactions on Graphics (TOG), 42(4):1–12, 2023.
[6] B. E. Jackson, D. J. Evangelista, D. D. Ray, and T. L. Hedrick. 3d for the people: multi-camera motion capture in the field with consumer-grade cameras and open source software. Biology Open, 5(9):1334–1342, 2016.
[7] Y. Jiang, Z. Shen, P. Wang, Z. Su, Y. Hong, Y. Zhang, J. Yu, and L. Xu. Hifi4g: High-fidelity human performance rendering via compact gaussian splatting. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 19734–19745, June 2024.
[8] Y. Jin, Z. Li, D. Zhu, et al. Automatic and real-time green screen keying. The Visual Computer, 38:3135–3147, Sept. 2022. Published: 20 June 2022; Accepted: 18 May 2022.
[9] B. Mildenhall, P. P. Srinivasan, M. Tancik, J. T. Barron, R. Ramamoorthi, and R. Ng. NeRF: Representing scenes as neural radiance fields for view synthesis. In ECCV, 2020.
[10] K. Park, U. Sinha, J. T. Barron, S. Bouaziz, D. B. Goldman, S. M. Seitz, and R. Martin-Brualla. Nerfies: Deformable neural radiance fields. ICCV, 2021.
[11] K. Park, U. Sinha, P. Hedman, J. T. Barron, S. Bouaziz, D. B. Goldman, R. Martin-Brualla, and S. M. Seitz. Hypernerf: A higher-dimensional representation for topologically varying neural radiance fields. ACM Trans. Graph., 40(6), Dec. 2021.
[12] Y. Rong, K. Li, H. Li, Y. Furukawa, Z. Yan, and S. Tulsiani. Neural 3D Video: A dataset and benchmark. https://github.com/facebookresearch/Neural_3D_Video, 2022. GitHub repository, accessed 13 Jul 2025.
[13] Underwater Photography Guide. Manual white balance techniques for underwater photography. https://www.uwphotographyguide.com/underwater-photography-ambient-light-manual-white-balance, n.d. Accessed: 2025-07-06.
[14] L. Wang, Q. Hu, Q. He, Z. Wang, J. Yu, T. Tuytelaars, L. Xu, and M. Wu. Neural residual radiance fields for streamably free-viewpoint videos. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 76–87, June 2023.
[15] P. Wang, Z. Zhang, L. Wang, K. Yao, S. Xie, J. Yu, M. Wu, and L. Xu. V^3: Viewing volumetric videos on mobiles via streamable 2D dynamic Gaussians. ACM Transactions on Graphics (TOG), 43(6):1–13, 2024.
[16] Z. Xu, Y. Xu, Z. Yu, S. Peng, J. Sun, H. Bao, and X. Zhou. Representing long volumetric video with temporal gaussian hierarchy. ACM Transactions on Graphics, 43(6), November 2024.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/99649
dc.description.abstract [zh_TW]:
隨著高斯潑灑等神經渲染技術的興起,三維重建進入了低成本、高品質的新時代。然而,在進行四維場景重建時,為了維持重建品質,仍需仰賴多台高階攝影機與專業攝影棚的拍攝環境,對一般使用者而言門檻仍高、成本不易負擔。本研究旨在探討如何在低成本、不穩定的場地條件下,建構一套可用於四維高斯潑灑訓練的高品質資料拍攝與處理流程。所提出的系統涵蓋了多機架設策略、影像同步、色彩校正、前景擷取與相機參數計算等環節,並以商用iPhone裝置實作驗證其可行性。
本研究採用V^3模型(Viewing Volumetric Videos on Mobiles via Streamable 2D Dynamic Gaussians)進行訓練與重建,驗證所建立資料集在不同實驗配置下的有效性。透過本方法,期望能降低四維重建技術的門檻,促進其在教育、娛樂與個人應用中的普及化。
dc.description.abstract [en]:
Neural rendering techniques such as Gaussian Splatting have significantly lowered the cost of high-quality 3D reconstruction. However, their extension to 4D reconstruction still typically relies on multi-camera setups and controlled studio environments, which pose practical barriers for general users. This thesis proposes a low-cost and flexible data acquisition pipeline for 4D Gaussian Splatting using consumer-grade iPhone devices. The pipeline integrates synchronized multi-camera capture, color correction, foreground extraction, and camera parameter estimation.
We integrate the captured data into a modified version of the V^3 framework (Viewing Volumetric Videos on Mobiles via Streamable 2D Dynamic Gaussians) to validate the feasibility and effectiveness of our approach. Experimental results demonstrate that high-quality training data can be acquired under non-professional conditions, thereby making 4D reconstruction more accessible to a wider range of users.
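The abstract lists synchronized multi-camera capture as one stage of the pipeline. Purely as an illustration (the thesis's own synchronization method, Section 3.4.1, is behind restricted access), a common low-cost way to align clips from consumer cameras is to cross-correlate their audio tracks; the helper below is a hypothetical sketch assuming mono float arrays at a common sample rate:

```python
import numpy as np

def estimate_offset(ref_audio, other_audio):
    """Estimate the lag, in samples, of other_audio relative to
    ref_audio via cross-correlation; a positive value means the same
    event appears later in other_audio. Illustrative sketch only."""
    corr = np.correlate(other_audio, ref_audio, mode="full")
    # The zero-lag term of the full correlation sits at index
    # len(ref_audio) - 1, so subtract it to recover the signed lag.
    return int(np.argmax(corr)) - (len(ref_audio) - 1)
```

Dividing the sample lag by the audio sample rate gives the time offset to trim from the later-starting clip before frame extraction.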
dc.description.provenance [en]: Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-09-17T16:15:41Z. No. of bitstreams: 0
dc.description.provenance [en]: Made available in DSpace on 2025-09-17T16:15:41Z (GMT). No. of bitstreams: 0
dc.description.tableofcontents:
Acknowledgements i
摘要 ii
Abstract iii
Contents v
List of Figures vii
List of Tables viii
Chapter 1 Introduction 1
Chapter 2 Related Works 3
2.1 4D Capture Systems: From Studios to Consumer-Grade Solutions 3
2.2 Photometric preprocessing for Multi-View Consistency 4
Chapter 3 System Design 7
3.1 System Overview 7
3.2 Camera Settings 7
3.3 Video Capture Workflow 9
3.3.1 Preview 9
3.3.2 Outlier Inspection and Representative Calibration Selection 10
3.4 Preprocess I: Temporal and Color Alignment 11
3.4.1 Time Synchronization 12
3.4.2 White Balance Correction 13
3.5 Background Subtraction and Foreground Extraction 14
3.6 Preprocess II: Progressive HSV-Based Color Spill Suppression 16
Chapter 4 Experiment 20
4.1 Impact of Multi-Camera Temporal Synchronization Accuracy on 4DGS Quality 20
4.2 Camera Density and Baseline Spacing Study 23
Chapter 5 Discussion 28
5.1 Practical Considerations in Multi-Camera Calibration 28
5.2 Limitations of Background Matting in Multi-View Capture 30
5.3 BGMv2 Versus SAM2 31
5.4 Limitations 33
5.5 Future Work 34
Chapter 6 Conclusion 35
References 37
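Section 3.6 of the table of contents names a progressive HSV-based color spill suppression step. Since the thesis itself is access-restricted, the following is only a generic per-pixel sketch of the idea (not the author's algorithm): hues near pure green are desaturated in proportion to their closeness to green, so suppression ramps up progressively rather than hard-keying.

```python
import colorsys

def suppress_green_spill(rgb, green_hue=1.0 / 3.0, width=0.10):
    """Illustrative (not the thesis's) spill suppression for one pixel.

    rgb is an (r, g, b) tuple of floats in [0, 1]. Pixels whose hue
    lies within `width` of pure green have their saturation scaled
    down linearly, reaching full desaturation at exactly green."""
    h, s, v = colorsys.rgb_to_hsv(*rgb)
    dist = abs(h - green_hue)
    if dist < width:
        s *= dist / width  # progressive: 0 at pure green, 1 at the edge
    return colorsys.hsv_to_rgb(h, s, v)
```

Colors well away from green pass through unchanged, which is the property a spill-suppression pass needs to avoid disturbing foreground skin tones.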
dc.language.iso: en
dc.subject [zh_TW]: 高斯潑灑
dc.subject [zh_TW]: 四維重建
dc.subject [zh_TW]: 訓練資料
dc.subject [zh_TW]: 影像擷取
dc.subject [zh_TW]: 多視角
dc.subject [en]: Multi-view Imaging
dc.subject [en]: Neural Rendering
dc.subject [en]: Training Data Generation
dc.subject [en]: Image Acquisition
dc.subject [en]: 4D Gaussian Splatting
dc.title [zh_TW]: 四維高斯潑灑資料擷取系統建構與測試
dc.title [en]: Construction and Evaluation of a Data Acquisition System for 4D Gaussian Splatting
dc.type: Thesis
dc.date.schoolyear: 113-2
dc.description.degree: 碩士 (Master's)
dc.contributor.oralexamcommittee [zh_TW]: 陳祝嵩;林經堯;陳維超;楊奕軒
dc.contributor.oralexamcommittee [en]: Chu-Song Chen;Jin-Yao Lin;Wei-Chao Chen;Yi-Hsuan Yang
dc.subject.keyword [zh_TW]: 高斯潑灑,四維重建,訓練資料,影像擷取,多視角
dc.subject.keyword [en]: 4D Gaussian Splatting,Neural Rendering,Training Data Generation,Image Acquisition,Multi-view Imaging
dc.relation.page: 39
dc.identifier.doi: 10.6342/NTU202503904
dc.rights.note: 同意授權(限校園內公開) (Authorized for campus-only access)
dc.date.accepted: 2025-08-11
dc.contributor.author-college: 電機資訊學院 (College of Electrical Engineering and Computer Science)
dc.contributor.author-dept: 資訊工程學系 (Department of Computer Science and Information Engineering)
dc.date.embargo-lift: 2030-08-05
Appears in Collections: 資訊工程學系 (Department of Computer Science and Information Engineering)

Files in This Item:
ntu-113-2.pdf (Restricted Access), 1.39 MB, Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
