Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98437
Full metadata record (DC field: value (language))
dc.contributor.advisor: 簡韶逸 (zh_TW)
dc.contributor.advisor: Shao-Yi Chien (en)
dc.contributor.author: 周奕節 (zh_TW)
dc.contributor.author: Yi-Chieh Chou (en)
dc.date.accessioned: 2025-08-14T16:06:55Z
dc.date.available: 2025-08-15
dc.date.copyright: 2025-08-14
dc.date.issued: 2025
dc.date.submitted: 2025-07-31
dc.identifier.citation:
B. Mildenhall, P. P. Srinivasan, M. Tancik, J. T. Barron, R. Ramamoorthi, and R. Ng, “NeRF: Representing scenes as neural radiance fields for view synthesis,” in ECCV, 2020.
B. Kerbl, G. Kopanas, T. Leimkühler, and G. Drettakis, “3D Gaussian splatting for real-time radiance field rendering,” ACM Transactions on Graphics, vol. 42, no. 4, July 2023. [Online]. Available: https://repo-sam.inria.fr/fungraph/3d-gaussian-splatting/
D. Charatan, S. Li, A. Tagliasacchi, and V. Sitzmann, “pixelSplat: 3D Gaussian splats from image pairs for scalable generalizable 3D reconstruction,” in CVPR, 2024.
Z. Liu, H. Ouyang, Q. Wang, K. L. Cheng, J. Xiao, K. Zhu, N. Xue, Y. Liu, Y. Shen, and Y. Cao, “InFusion: Inpainting 3D Gaussians via learning depth completion from diffusion prior,” arXiv preprint arXiv:2404.11613, 2024.
L. Liu, X. Wang, J. Qiu, T. Lin, X. Zhou, and Z. Su, “Gaussian Object Carver: Object-compositional Gaussian splatting with surface completion,” arXiv preprint arXiv:2412.02075, 2024.
Y. Chen, H. Xu, C. Zheng, B. Zhuang, M. Pollefeys, A. Geiger, T.-J. Cham, and J. Cai, “MVSplat: Efficient 3D Gaussian splatting from sparse multi-view images,” arXiv preprint arXiv:2403.14627, 2024.
P. Wang, L. Liu, Y. Liu, C. Theobalt, T. Komura, and W. Wang, “NeuS: Learning neural implicit surfaces by volume rendering for multi-view reconstruction,” arXiv preprint arXiv:2106.10689, 2021.
L. Yariv, J. Gu, Y. Kasten, and Y. Lipman, “Volume rendering of neural implicit surfaces,” in NeurIPS, 2021.
A. Kirillov, E. Mintun, N. Ravi, H. Mao, C. Rolland, L. Gustafson, T. Xiao, S. Whitehead, A. C. Berg, W.-Y. Lo, P. Dollár, and R. Girshick, “Segment Anything,” arXiv preprint arXiv:2304.02643, 2023.
W. Abdulla, “Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow,” https://github.com/matterport/Mask_RCNN, 2017.
C. R. Qi, H. Su, K. Mo, and L. J. Guibas, “PointNet: Deep learning on point sets for 3D classification and segmentation,” arXiv preprint arXiv:1612.00593, 2016.
A. Dai and M. Nießner, “3DMV: Joint 3D-multi-view prediction for 3D semantic scene segmentation,” in ECCV, 2018.
M. Ye, M. Danelljan, F. Yu, and L. Ke, “Gaussian Grouping: Segment and edit anything in 3D scenes,” in ECCV, 2024.
I. Vizzo, X. Chen, N. Chebrolu, J. Behley, and C. Stachniss, “Poisson surface reconstruction for LiDAR odometry and mapping,” in ICRA, 2021.
W. Yuan, T. Khot, D. Held, C. Mertz, and M. Hebert, “PCN: Point Completion Network,” in 3DV, 2018.
T. Groueix, M. Fisher, V. G. Kim, B. Russell, and M. Aubry, “AtlasNet: A papier-mâché approach to learning 3D surface generation,” in CVPR, 2018.
X. Yu, Y. Rao, Z. Wang, Z. Liu, J. Lu, and J. Zhou, “PoinTr: Diverse point cloud completion with geometry-aware transformers,” in ICCV, 2021.
X. Yan, L. Lin, N. J. Mitra, D. Lischinski, D. Cohen-Or, and H. Huang, “ShapeFormer: Transformer-based shape completion via sparse representation,” 2022.
K. He, X. Chen, S. Xie, Y. Li, P. Dollár, and R. Girshick, “Masked autoencoders are scalable vision learners,” arXiv preprint arXiv:2111.06377, 2021.
Y. Pang, W. Wang, F. E. Tay, W. Liu, Y. Tian, and L. Yuan, “Masked autoencoders for point cloud self-supervised learning,” in ECCV, 2022, pp. 604–621.
L. Yang, B. Kang, Z. Huang, X. Xu, J. Feng, and H. Zhao, “Depth Anything: Unleashing the power of large-scale unlabeled data,” in CVPR, 2024.
A. Eftekhar, A. Sax, J. Malik, and A. Zamir, “Omnidata: A scalable pipeline for making multi-task mid-level vision datasets from 3D scans,” in ICCV, 2021, pp. 10786–10796.
A. X. Chang, T. Funkhouser, L. Guibas, P. Hanrahan, Q. Huang, Z. Li, S. Savarese, M. Savva, S. Song, H. Su et al., “ShapeNet: An information-rich 3D model repository,” arXiv preprint arXiv:1512.03012, 2015.
Q. Ma, Y. Li, B. Ren, N. Sebe, E. Konukoglu, T. Gevers, L. V. Gool, and D. P. Paudel, “ShapeSplat: A large-scale dataset of Gaussian splats and their self-supervised pretraining,” 2024. [Online]. Available: https://arxiv.org/abs/2408.10906
J. Straub, T. Whelan, L. Ma, Y. Chen, E. Wijmans, S. Green, J. J. Engel, R. Mur-Artal, C. Ren, S. Verma, A. Clarkson, M. Yan, B. Budge, Y. Yan, X. Pan, J. Yon, Y. Zou, K. Leon, N. Carter, J. Briales, T. Gillingham, E. Mueggler, L. Pesqueira, M. Savva, D. Batra, H. M. Strasdat, R. D. Nardi, M. Goesele, S. Lovegrove, and R. Newcombe, “The Replica dataset: A digital replica of indoor spaces,” arXiv preprint arXiv:1906.05797, 2019.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98437
dc.description.abstract (zh_TW, translated):
3D reconstruction of indoor scenes involves creating three-dimensional representations of physical spaces from captured data such as images or LiDAR. This process is essential for applications such as digital twin creation, AR/VR scene generation, and indoor navigation. Although recent neural representations such as Neural Radiance Fields (NeRF) and 3D Gaussian Splatting (3DGS) achieve photorealistic novel view synthesis, and sparse-view methods lower the viewpoint requirements, real-world captures still suffer from occlusions and limited viewpoints, yielding incomplete observations that leave holes and artifacts in the reconstructed scene.
This thesis proposes a three-stage method for indoor 3D scene reconstruction and completion. First, we perform initial scene reconstruction with multi-cue supervision, combining depth and normal supervision to ensure accurate surface alignment and to reduce artifacts in the 3DGS representation. Second, we decompose the complex scene into individual objects via segmentation and tracking, making object-level completion tractable. Finally, we perform object-level completion with a geometry-first approach that decouples geometric reconstruction from appearance synthesis: a shape completion model rebuilds the missing geometry from the partial point cloud, after which a shape-aware Gaussian masked autoencoder with a modified masking strategy selectively masks the newly completed regions while treating the originally visible regions as unmasked input. Its decoder then reconstructs full 3D Gaussian attributes for these newly added regions.
Experimental results show that our method outperforms existing approaches. Unlike current 3DGS inpainting techniques, which focus on local hole filling or object removal, and unlike hybrid methods that only complete mesh representations without producing a full 3DGS output, our method yields complete 3DGS representations of indoor scenes.
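The Stage 1 objective described above can be pictured as a weighted sum of photometric, depth, and normal terms. The following PyTorch sketch is purely illustrative: the function name multi_cue_loss, the weights w_depth and w_normal, and the median-based depth normalization are assumptions, not the formulation used in the thesis.

```python
# Illustrative sketch (not the thesis implementation) of multi-cue
# supervision for 3DGS fitting: an RGB term plus depth and normal terms.
import torch
import torch.nn.functional as F

def multi_cue_loss(rendered_rgb, rendered_depth, rendered_normal,
                   gt_rgb, mono_depth, mono_normal,
                   w_depth=0.1, w_normal=0.05):
    """Combine photometric, depth, and normal supervision (assumed weights)."""
    # Standard photometric term used by 3DGS (L1; the SSIM term is omitted).
    loss_rgb = F.l1_loss(rendered_rgb, gt_rgb)

    # Monocular depth predictions are scale-ambiguous, so compare after a
    # per-image normalization (a common practice; the exact scheme here is
    # an assumption).
    def norm(d):
        return (d - d.median()) / (d.abs().mean() + 1e-6)

    loss_depth = F.l1_loss(norm(rendered_depth), norm(mono_depth))

    # Normal supervision: penalize angular deviation via cosine similarity
    # between rendered and monocular normal maps (last dim holds x, y, z).
    loss_normal = (1.0 - F.cosine_similarity(
        rendered_normal, mono_normal, dim=-1)).mean()

    return loss_rgb + w_depth * loss_depth + w_normal * loss_normal
```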
dc.description.abstract (en):
3D reconstruction of indoor scenes involves creating three-dimensional digital representations of physical spaces from captured data such as images or LiDAR. This process is crucial for applications in digital twin creation, AR/VR scene generation, and indoor navigation. While recent neural representations like Neural Radiance Fields (NeRF) and 3D Gaussian Splatting (3DGS) enable photorealistic novel view synthesis, and sparse-view methods like pixelSplat reduce viewpoint requirements, real-world capture scenarios still suffer from occlusions and limited viewpoints, resulting in incomplete observations that cause holes and artifacts in reconstructed scenes.
This thesis presents a unified approach to reconstruction and completion of indoor 3D scenes through a novel three-stage pipeline. First, we perform initial scene reconstruction with multi-cue supervision, incorporating depth and normal supervision to ensure precise surface alignment and to reduce artifacts in the 3DGS representation. Second, we decompose complex scenes into individual objects through segmentation and tracking, making completion tractable at the object level. Third, we perform object-level completion using a geometry-first approach that separates geometric reconstruction from appearance synthesis: shape completion models reconstruct missing geometry from partial point clouds, and a Shape-Aware Gaussian-MAE with a modified masking strategy selectively masks only the newly completed regions while treating the originally visible areas as unmasked input. The decoder then reconstructs full 3D Gaussian attributes for these newly added regions.
Experimental results demonstrate that our approach outperforms existing methods. Unlike current 3DGS inpainting techniques, which focus on localized hole filling or object removal, or hybrid methods that only complete mesh representations without generating full 3DGS outputs, our method produces complete 3DGS representations of indoor scenes.
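To make the shape-aware masking strategy concrete, below is a minimal PyTorch sketch of an MAE-style model in which patches from newly completed geometry form the masked set and originally observed patches are always visible. The class name ShapeAwareMAE, all dimensions, the equal-visible-count assumption, and the 59-dimensional attribute head (3 position + 3 scale + 4 rotation + 1 opacity + 48 SH color coefficients) are illustrative assumptions, not the thesis's Shape-Aware Gaussian-MAE implementation.

```python
# Minimal sketch of shape-aware masking: instead of random masking as in
# a standard MAE, completed-geometry patches are always masked and
# originally observed patches are always visible. Names and shapes are
# illustrative assumptions.
import torch
import torch.nn as nn

class ShapeAwareMAE(nn.Module):
    def __init__(self, dim=384, enc_layers=6, dec_layers=2, attr_dim=59):
        super().__init__()
        layer = lambda: nn.TransformerEncoderLayer(dim, nhead=6,
                                                   batch_first=True)
        self.encoder = nn.TransformerEncoder(layer(), enc_layers)
        self.decoder = nn.TransformerEncoder(layer(), dec_layers)
        self.mask_token = nn.Parameter(torch.zeros(1, 1, dim))
        # Predict per-patch Gaussian attributes; 59 is an assumed total.
        self.head = nn.Linear(dim, attr_dim)

    def forward(self, tokens, completed_mask):
        # tokens: (B, N, dim) patch embeddings; completed_mask: (B, N) bool,
        # True where a patch comes from shape-completed geometry.
        B, N, D = tokens.shape
        # The encoder attends only over originally visible patches. For
        # simplicity this sketch assumes every sample in the batch has the
        # same number of visible patches.
        visible = tokens[~completed_mask].view(B, -1, D)
        encoded = self.encoder(visible)
        # Reassemble the full sequence: learned mask tokens at completed
        # positions, encoder output at visible positions.
        full = self.mask_token.expand(B, N, D).clone()
        full[~completed_mask] = encoded.reshape(-1, D)
        decoded = self.decoder(full)
        # Gaussian attributes are predicted only for the completed regions.
        return self.head(decoded[completed_mask])
```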
dc.description.provenance: Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-08-14T16:06:54Z. No. of bitstreams: 0 (en)
dc.description.provenance: Made available in DSpace on 2025-08-14T16:06:55Z (GMT). No. of bitstreams: 0 (en)
dc.description.tableofcontents:
Master's Thesis Acceptance Certificate i
Acknowledgements (致謝) iii
Abstract in Chinese (中文摘要) v
Abstract vii
Contents ix
List of Figures xiii
List of Tables xv
1 Introduction 1
1.1 3D Reconstruction 1
1.2 3D Gaussian Splatting 2
1.3 Challenges 4
1.4 Contributions 5
1.5 Thesis Organization 6
2 Related Work 9
2.1 Neural 3D Scene Representations 9
2.1.1 Sparse-View 3D Gaussian Splatting 10
2.2 Scene Decomposition and Segmentation 11
2.3 3D Shape Completion 12
2.4 3D Gaussian Splatting Completion and Editing 13
2.5 Masked Autoencoders and Self-Supervised Learning 14
3 Proposed Method 17
3.1 Pipeline Overview 17
3.2 Stage 1 - Scene Reconstruction with Multi-Cue Supervision 18
3.2.1 Preprocessing 18
3.2.2 Multi-Cue 3DGS Optimization 23
3.3 Stage 2 - Scene Decomposition 25
3.3.1 Object-Level Extraction Process 25
3.3.2 Spatial Coherence and Boundary Refinement 26
3.3.3 Benefits of Object-Level Decomposition 26
3.4 Stage 3 - Object-Level Completion 27
3.4.1 Geometry Completion (ShapeFormer) 27
3.4.2 3D Gaussian Completion (Shape-Aware G-MAE) 28
3.5 Post-processing 36
3.5.1 Addressing Domain Gap Issues 36
3.5.2 Appearance Refinement Strategies 36
3.5.3 Integration and Final Output 36
4 Experiments 39
4.1 Datasets 39
4.1.1 ShapeNet Dataset 39
4.1.2 ShapeSplat Dataset 39
4.1.3 Replica Dataset 40
4.2 Implementation Details 41
4.2.1 3D Gaussian Splatting Configuration 41
4.2.2 ShapeFormer Configuration 41
4.2.3 Gaussian-MAE Configuration 42
4.3 Results 43
4.3.1 ShapeSplat Objects Results 43
4.3.2 Replica Scenes Results 45
4.3.3 Computational Performance Analysis 52
4.4 Ablation Studies 52
4.5 Comparison with Related Methods 54
4.5.1 Comparison with Gaussian Object Carver (GOC) 54
4.5.2 Comparison with Sparse-View 3D Gaussian Splatting 54
5 Conclusion 57
5.1 Conclusion 57
5.2 Limitations and Future Work 58
5.2.1 Limitations 58
5.2.2 Future Work 58
References 61
dc.language.iso: en
dc.subject: 三維高斯散射 [3D Gaussian Splatting] (zh_TW)
dc.subject: 室內場景重建 [Indoor Scene Reconstruction] (zh_TW)
dc.subject: 遮罩自編碼器 [Masked Autoencoder] (zh_TW)
dc.subject: 物體層級分解 [Object-Level Decomposition] (zh_TW)
dc.subject: 場景補全 [Scene Completion] (zh_TW)
dc.subject: Scene Completion (en)
dc.subject: Object-Level Decomposition (en)
dc.subject: Masked Autoencoder (en)
dc.subject: 3D Gaussian Splatting (en)
dc.subject: Indoor Scene Reconstruction (en)
dc.title: 基於形狀感知的室內場景三維高斯散射重建與補全方法 [A Shape-Aware Method for Reconstruction and Completion of Indoor Scenes with 3D Gaussian Splatting] (zh_TW)
dc.title: Filling the Gaps: Shape-Aware Completion of 3D Gaussian Splatting for Indoor Scenes (en)
dc.type: Thesis
dc.date.schoolyear: 113-2
dc.description.degree: 碩士 [Master's]
dc.contributor.oralexamcommittee: 陳駿丞;陳永耀;盧奕璋 (zh_TW)
dc.contributor.oralexamcommittee: Jun-Cheng Chen; Yung-Yao Chen; Yi-Chang Lu (en)
dc.subject.keyword: 室內場景重建,三維高斯散射,場景補全,物體層級分解,遮罩自編碼器 [Indoor Scene Reconstruction, 3D Gaussian Splatting, Scene Completion, Object-Level Decomposition, Masked Autoencoder] (zh_TW)
dc.subject.keyword: Indoor Scene Reconstruction, 3D Gaussian Splatting, Scene Completion, Object-Level Decomposition, Masked Autoencoder (en)
dc.relation.page: 64
dc.identifier.doi: 10.6342/NTU202503130
dc.rights.note: 未授權 [Not authorized for public access]
dc.date.accepted: 2025-08-05
dc.contributor.author-college: 電機資訊學院 [College of Electrical Engineering and Computer Science]
dc.contributor.author-dept: 電子工程學研究所 [Graduate Institute of Electronics Engineering]
dc.date.embargo-lift: N/A
Appears in collections: 電子工程學研究所 [Graduate Institute of Electronics Engineering]

Files in this item:
ntu-113-2.pdf (24.45 MB, Adobe PDF): restricted, not publicly available (未授權公開取用)

