Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/96960

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 施吉昇 | zh_TW |
| dc.contributor.advisor | Chi-Sheng Shih | en |
| dc.contributor.author | 林佑鑫 | zh_TW |
| dc.contributor.author | Yu-Hsin Lin | en |
| dc.date.accessioned | 2025-02-25T16:14:32Z | - |
| dc.date.available | 2025-02-26 | - |
| dc.date.copyright | 2025-02-25 | - |
| dc.date.issued | 2024 | - |
| dc.date.submitted | 2025-02-06 | - |
| dc.identifier.citation | [1] X. Ma, C. Qin, H. You, H. Ran, and Y. Fu, "Rethinking network design and local geometry in point cloud: A simple residual MLP framework," arXiv preprint arXiv:2202.07123, 2022.
[2] Z. J. Yew and G. H. Lee, "REGTR: End-to-end point cloud correspondences with transformers," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 6677–6686.
[3] L. Downs, A. Francis, N. Koenig, B. Kinman, R. Hickman, K. Reymann, T. B. McHugh, and V. Vanhoucke, "Google Scanned Objects: A high-quality dataset of 3D scanned household items," in 2022 International Conference on Robotics and Automation (ICRA). IEEE, 2022, pp. 2553–2560.
[4] A. X. Chang, T. Funkhouser, L. Guibas, P. Hanrahan, Q. Huang, Z. Li, S. Savarese, M. Savva, S. Song, H. Su et al., "ShapeNet: An information-rich 3D model repository," arXiv preprint arXiv:1512.03012, 2015.
[5] P. J. Besl and N. D. McKay, "A method for registration of 3-D shapes," in Sensor Fusion IV: Control Paradigms and Data Structures, vol. 1611. SPIE, 1992, pp. 586–606.
[6] S. Rusinkiewicz and M. Levoy, "Efficient variants of the ICP algorithm," in Proceedings of the Third International Conference on 3-D Digital Imaging and Modeling. IEEE, 2001, pp. 145–152.
[7] M. A. Fischler and R. C. Bolles, "Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography," Communications of the ACM, vol. 24, no. 6, pp. 381–395, 1981.
[8] Q.-Y. Zhou, J. Park, and V. Koltun, "Fast global registration," in Computer Vision – ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11–14, 2016, Proceedings, Part II. Springer, 2016, pp. 766–782.
[9] R. B. Rusu, N. Blodow, and M. Beetz, "Fast point feature histograms (FPFH) for 3D registration," in 2009 IEEE International Conference on Robotics and Automation. IEEE, 2009, pp. 3212–3217.
[10] Y. Aoki, H. Goforth, R. A. Srivatsan, and S. Lucey, "PointNetLK: Robust & efficient point cloud registration using PointNet," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 7163–7172.
[11] V. Sarode, X. Li, H. Goforth, Y. Aoki, R. A. Srivatsan, S. Lucey, and H. Choset, "PCRNet: Point cloud registration network using PointNet encoding," arXiv preprint arXiv:1908.07906, 2019.
[12] C. R. Qi, H. Su, K. Mo, and L. J. Guibas, "PointNet: Deep learning on point sets for 3D classification and segmentation," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 652–660.
[13] B. D. Lucas and T. Kanade, "An iterative image registration technique with an application to stereo vision," in IJCAI'81: 7th International Joint Conference on Artificial Intelligence, vol. 2, 1981, pp. 674–679.
[14] Y. Wang and J. M. Solomon, "Deep closest point: Learning representations for point cloud registration," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019, pp. 3523–3532.
[15] Y. Wang and J. M. Solomon, "PRNet: Self-supervised learning for partial-to-partial registration," in Advances in Neural Information Processing Systems, vol. 32, 2019.
[16] E. Brachmann, A. Krull, S. Nowozin, J. Shotton, F. Michel, S. Gumhold, and C. Rother, "DSAC – Differentiable RANSAC for camera localization," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 6684–6692.
[17] G. D. Pais, S. Ramalingam, V. M. Govindu, J. C. Nascimento, R. Chellappa, and P. Miraldo, "3DRegNet: A deep neural network for 3D point registration," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 7193–7203.
[18] C. R. Qi, L. Yi, H. Su, and L. J. Guibas, "PointNet++: Deep hierarchical feature learning on point sets in a metric space," in Advances in Neural Information Processing Systems, vol. 30, 2017.
[19] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, "Attention is all you need," in Advances in Neural Information Processing Systems, vol. 30, 2017.
[20] R. T. Azuma, "A survey of augmented reality," Presence: Teleoperators and Virtual Environments, vol. 6, no. 4, pp. 355–385, 1997.
[21] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning internal representations by error propagation," in Parallel Distributed Processing: Explorations in the Microstructure of Cognition, vol. 1, D. E. Rumelhart and J. L. McClelland, Eds. MIT Press, 1986.
[22] I. Loshchilov and F. Hutter, "Fixing weight decay regularization in Adam," arXiv preprint arXiv:1711.05101, 2017. | - |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/96960 | - |
| dc.description.abstract | 點雲資料在三維任務中扮演著重要角色,但由於傳感器的硬體限制和環境因素,收集的點雲可能會稀疏或部分缺失,這給後續處理帶來了困難。因此,點雲密度增強方法在點雲資料處理領域中顯得尤為重要。過往研究提出的方法多基於點雲的幾何特徵進行操作和預測。本研究提出了CART (Color-Aided Registration Transformer) 點雲疊合模型,該模型結合了幾何和顏色特徵,能有效預測來源與目標點雲的重疊部分與轉換矩陣,對低重疊率的點雲進行精確疊合,從而增強點雲密度,以提升物件辨識的精度。
在真實物體掃描的資料集 Google Scanned Objects 上,CART 模型達到了2.7度的旋轉誤差和0.66公分的平移誤差,與未使用顏色輔助的模型 RegTR 相比,在僅增加不到1%的模型尺寸的情況下,旋轉和平移誤差降低了20%至60%。此外,在大型 3D CAD 模擬資料集 ShapeNet 中,CART 模型達到了3.55度和0.77公分的誤差。特別是在來源和目標點雲存在大角度旋轉(如90度和180度)差異的情況下,CART 模型由於顏色特徵的輔助,對於對稱物體不受影響,極大幅度地降低了誤差,對比 RegTR 模型,誤差降低了80%。 | zh_TW |
| dc.description.abstract | Point cloud data plays a crucial role in 3D tasks, but due to sensor hardware limitations and environmental factors, collected point clouds are often sparse or partially missing, which complicates subsequent processing. Point cloud density enhancement methods are therefore important in point cloud data processing. Most previous methods operate on, and predict transformations from, the geometric features of point clouds alone. This study proposes CART (Color-Aided Registration Transformer), a point cloud registration model that combines geometric and color features. It predicts the overlapping region between the source and target point clouds and then estimates the transformation matrix, effectively registering point clouds with low overlap and thereby enhancing point cloud density and improving object recognition accuracy.
On Google Scanned Objects, a dataset of real-world object scans, the CART model achieves a rotation error of 2.7 degrees and a translation error of 0.66 cm. Compared to the RegTR model, which does not use color assistance, CART reduces rotation and translation errors by 20% to 60% while increasing model size by less than 1%. On ShapeNet, a large-scale 3D CAD simulation dataset, CART achieves errors of 3.55 degrees and 0.77 cm. In particular, when the source and target point clouds differ by a large rotation, such as 90 or 180 degrees, the color features keep CART robust on symmetric objects, and its error is up to 80% lower than that of RegTR. | en |
| dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-02-25T16:14:32Z No. of bitstreams: 0 | en |
| dc.description.provenance | Made available in DSpace on 2025-02-25T16:14:32Z (GMT). No. of bitstreams: 0 | en |
| dc.description.tableofcontents | 口試委員審定書 i
誌謝 v
摘要 vii
Abstract ix
Contents xi
List of Figures xv
List of Tables xvii
Chapter 1 Introduction 1
1.1 Motivation 1
1.2 Objectives 4
1.3 Thesis Organization 5
Chapter 2 Background and Related Works 7
2.1 Background 7
2.2 Related Works 8
2.2.1 Traditional Point Cloud Registration Methods 10
2.2.2 Learning-Based Point Cloud Registration Methods 11
2.2.3 Point Cloud Feature Extractor 12
2.2.4 Transformers Network Models 14
Chapter 3 System Architecture and Problem Definition 17
3.1 System Architecture 17
3.2 Goals and Requirements 19
3.3 Problem Definition 20
3.4 Challenges 21
Chapter 4 Design and Implementation 23
4.1 Networks Architecture 23
4.1.1 Correspondence Decoder 26
4.1.2 Rigid Transform Computation - SVD 28
4.2 Training Datasets 30
4.2.1 Google Scanned Objects 30
4.2.2 ShapeNet 31
4.2.3 Comparison of GSO and ShapeNet Datasets 33
4.3 Training Algorithm 34
4.3.1 Preprocessing 34
4.3.2 Training Loss 36
Chapter 5 Performance Evaluation 41
5.1 Evaluation Datasets 41
5.2 Evaluation Metrics and Methodology 43
5.2.1 Evaluation Metrics 43
5.2.2 Methodology 45
5.3 Quantitative Results 47
5.3.1 Model Size and Complexity 47
5.3.2 Quantitative Results on the Real-World Scanned Asymmetric Dataset 48
5.3.3 Quantitative Results on the Real-World Scanned Symmetric Dataset 49
5.3.4 Quantitative Results on the Large-Scale Simulated Asymmetric Dataset 50
5.3.5 Quantitative Results on the Large-Scale Simulated Symmetric Dataset 51
5.3.6 Visual Representations of Performance 52
5.4 Qualitative Results 54
5.4.1 Results on Small Rotations 54
5.4.2 Results on Medium Rotations 57
5.4.3 Results on Arbitrary Rotations 59
5.5 Ablation Study 61
Chapter 6 Conclusion 63
6.1 Summary of Key Findings 63
6.2 Limitations and Future Work 64
References 67 | - |
| dc.language.iso | en | - |
| dc.subject | 顏色輔助 | zh_TW |
| dc.subject | 基礎變換模型 | zh_TW |
| dc.subject | 自動駕駛 | zh_TW |
| dc.subject | 三維感知 | zh_TW |
| dc.subject | 點雲密度增強 | zh_TW |
| dc.subject | 3D Sensing | en |
| dc.subject | Point Cloud Density Enhancement | en |
| dc.subject | Color-Aided | en |
| dc.subject | Transformer | en |
| dc.subject | Autonomous Driving | en |
| dc.title | 基於輕量化基礎變換模型之顏色輔助點雲密度增強方法 | zh_TW |
| dc.title | Point Cloud Color-Aided Density Enhancement Using Lightweight Transformer Model | en |
| dc.type | Thesis | - |
| dc.date.schoolyear | 113-1 | - |
| dc.description.degree | 碩士 | - |
| dc.contributor.oralexamcommittee | 洪士灝;張瑞益;郭峻因;蔡欣穆 | zh_TW |
| dc.contributor.oralexamcommittee | Shih-Hao Hung;Ray-I Chang;Jiun-In Guo;Hsin-Mu Tsai | en |
| dc.subject.keyword | 點雲密度增強,顏色輔助,基礎變換模型,自動駕駛,三維感知 | zh_TW |
| dc.subject.keyword | Point Cloud Density Enhancement,Color-Aided,Transformer,Autonomous Driving,3D Sensing | en |
| dc.relation.page | 69 | - |
| dc.identifier.doi | 10.6342/NTU202500472 | - |
| dc.rights.note | 未授權 | - |
| dc.date.accepted | 2025-02-07 | - |
| dc.contributor.author-college | 電機資訊學院 | - |
| dc.contributor.author-dept | 資訊工程學系 | - |
| dc.date.embargo-lift | N/A | - |
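
The table of contents lists "Rigid Transform Computation - SVD" (Section 4.1.2), i.e. the model recovers the rigid transform from predicted correspondences in closed form. The sketch below shows the standard SVD-based (Kabsch/Umeyama) solver such a step typically uses; it is an illustrative assumption based on the section title, not code from the thesis, and the function name `rigid_transform_svd` is invented here.

```python
import numpy as np

def rigid_transform_svd(src: np.ndarray, tgt: np.ndarray):
    """Closed-form rigid transform estimation (Kabsch/Umeyama).

    Given corresponding points src, tgt of shape (N, 3), returns the
    rotation R (3x3) and translation t (3,) minimizing
    sum_i || R @ src[i] + t - tgt[i] ||^2.
    """
    # Center both point sets on their centroids.
    src_mean, tgt_mean = src.mean(axis=0), tgt.mean(axis=0)
    src_c, tgt_c = src - src_mean, tgt - tgt_mean

    # Cross-covariance matrix and its SVD.
    H = src_c.T @ tgt_c
    U, _, Vt = np.linalg.svd(H)

    # Correct for reflections so that det(R) = +1 (a proper rotation).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    # Translation aligns the rotated source centroid with the target centroid.
    t = tgt_mean - R @ src_mean
    return R, t
```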
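
The abstract reports rotation errors in degrees and translation errors in centimeters. The record does not spell out the exact formulas, but a common convention for these metrics, assumed here, is the geodesic angle between the predicted and ground-truth rotations and the Euclidean distance between the translation vectors:

```python
import numpy as np

def registration_errors(R_pred, t_pred, R_gt, t_gt):
    """Isotropic rotation error (degrees) and translation error (same unit as t)."""
    # Geodesic distance on SO(3): angle of the relative rotation R_gt^T @ R_pred.
    cos_angle = (np.trace(R_gt.T @ R_pred) - 1.0) / 2.0
    rot_err_deg = np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))
    # Euclidean distance between the translation vectors.
    trans_err = np.linalg.norm(t_pred - t_gt)
    return rot_err_deg, trans_err
```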
| Appears in Collections: | 資訊工程學系 |

Files in This Item:

| File | Size | Format |
|---|---|---|
| ntu-113-1.pdf (restricted; not authorized for public access) | 11.8 MB | Adobe PDF |

All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.