Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/94712
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 施吉昇 | zh_TW |
dc.contributor.advisor | Chi-Sheng Shih | en |
dc.contributor.author | 王浚睿 | zh_TW |
dc.contributor.author | Jun-Rui Wang | en |
dc.date.accessioned | 2024-08-16T17:40:08Z | - |
dc.date.available | 2024-08-17 | - |
dc.date.copyright | 2024-08-16 | - |
dc.date.issued | 2024 | - |
dc.date.submitted | 2024-08-09 | - |
dc.identifier.citation | A. H. Lang, S. Vora, H. Caesar, L. Zhou, J. Yang, and O. Beijbom, “PointPillars: Fast encoders for object detection from point clouds,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 12697–12705.
Y. Zhou and O. Tuzel, “VoxelNet: End-to-end learning for point cloud based 3D object detection,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 4490–4499.
Y. Yan, Y. Mao, and B. Li, “SECOND: Sparsely embedded convolutional detection,” Sensors, vol. 18, no. 10, p. 3337, 2018.
S. Shi, X. Wang, and H. Li, “PointRCNN: 3D object proposal generation and detection from point cloud,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 770–779.
W. Shi and R. Rajkumar, “Point-GNN: Graph neural network for 3D object detection in a point cloud,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 1711–1719.
Z. Yang, Y. Sun, S. Liu, and J. Jia, “3DSSD: Point-based 3D single stage object detector,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 11040–11048.
F. Ma and S. Karaman, “Sparse-to-dense: Depth prediction from sparse depth samples and a single image,” in 2018 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2018, pp. 4796–4803.
S. Shi, C. Guo, L. Jiang, Z. Wang, J. Shi, X. Wang, and H. Li, “PV-RCNN: Point-voxel feature set abstraction for 3D object detection,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 10529–10538.
C. R. Qi, H. Su, K. Mo, and L. J. Guibas, “PointNet: Deep learning on point sets for 3D classification and segmentation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 652–660.
Z. Yang, Y. Zhou, Z. Chen, and J. Ngiam, “3D-MAN: 3D multi-frame attention network for object detection,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 1863–1872.
C. R. Qi, Y. Zhou, M. Najibi, P. Sun, K. Vo, B. Deng, and D. Anguelov, “Offboard 3D object detection from point cloud sequences,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 6134–6144.
Y. Chen and G. Medioni, “Object modelling by registration of multiple range images,” Image and Vision Computing, vol. 10, no. 3, pp. 145–155, 1992.
A. Segal, D. Haehnel, and S. Thrun, “Generalized-ICP,” in Robotics: Science and Systems, vol. 2, no. 4. Seattle, WA, 2009, p. 435.
R. B. Rusu, N. Blodow, and M. Beetz, “Fast point feature histograms (FPFH) for 3D registration,” in 2009 IEEE International Conference on Robotics and Automation. IEEE, 2009, pp. 3212–3217.
Q.-Y. Zhou, J. Park, and V. Koltun, “Fast global registration,” in Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11-14, 2016, Proceedings, Part II 14. Springer, 2016, pp. 766–782.
H. Yang, J. Shi, and L. Carlone, “TEASER: Fast and certifiable point cloud registration,” IEEE Transactions on Robotics, vol. 37, no. 2, pp. 314–333, 2020.
Y. Wang and J. M. Solomon, “Deep closest point: Learning representations for point cloud registration,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019, pp. 3523–3532.
Y. Wang, Y. Sun, Z. Liu, S. E. Sarma, M. M. Bronstein, and J. M. Solomon, “Dynamic graph CNN for learning on point clouds,” ACM Transactions on Graphics (TOG), vol. 38, no. 5, pp. 1–12, 2019.
A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is all you need,” Advances in Neural Information Processing Systems, vol. 30, 2017.
X. Zhang, W. Xu, C. Dong, and J. M. Dolan, “Efficient L-shape fitting for vehicle detection using laser scanners,” in 2017 IEEE Intelligent Vehicles Symposium (IV). IEEE, 2017, pp. 54–59. | - |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/94712 | - |
dc.description.abstract | 三維物件偵測是自動駕駛系統中不可或缺的組件,負責定位和分類由感測器收集的點雲或深度影像中的物件。這項任務使得自動駕駛車輛和路邊單元能夠有效地感知其周圍環境。此外,後續的決策任務大量依賴於三維物件偵測的結果,因此其準確性直接影響自動駕駛系統的性能和安全性。
近年來,大多數關於三維物件偵測的研究都利用深度神經網絡,需要標註的數據集來訓練模型。然而,在點雲中標註物件邊界框是一項耗時且具有挑戰性的任務。光達受到遮擋影響,只能提供環境的部分點雲。人工標註者發現很難在沒有其他感測器輔助的情況下為部分點雲標註完整的邊界框。 在這份工作中,我們提出了一個點雲對齊流程,可以對齊稀疏的車輛點雲而無需任何標註數據,並從聚合的點雲中生成邊界框。我們的流程利用車輛的輪廓進行對齊,解決了基於特徵的配準方法難以解決的稀疏點雲對齊挑戰。該流程包括一個邊界框估算器,用於生成粗略的邊界框,基於這些粗略邊界框進行初始對齊,並結合點對點和面對面方法進行點雲配準。 實驗結果顯示,我們的方法改善了邊界框的品質。在IoU閾值為0.7時,召回率提高了10%,而且在平移誤差和旋轉誤差方面也勝過基於特徵的配準方法。 | zh_TW |
dc.description.abstract | 3D object detection is an essential component of autonomous driving systems, responsible for localizing and classifying objects within the point clouds or depth images collected by sensors. This task enables self-driving vehicles and roadside units to effectively perceive their environment. Moreover, subsequent tasks such as decision-making rely heavily on the results of 3D object detection, so its accuracy directly influences the performance and safety of autonomous driving systems.
Recently, most works on 3D object detection have leveraged deep neural networks, which require annotated datasets to train models. However, annotating object bounding boxes in point clouds is a time-consuming and challenging task. LiDAR is affected by occlusion and can only provide partial views of the environment, and human annotators find it difficult to label complete bounding boxes for partial point clouds without the assistance of other sensors. In this work, we propose a point cloud alignment pipeline that aligns sparse vehicle point clouds without requiring any annotated data and generates bounding boxes from the aggregated point clouds. Our pipeline uses the vehicle's contour for alignment, addressing the sparse point cloud alignment challenge that feature-based registration methods struggle to solve. The pipeline comprises a bounding box estimator that generates rough bounding boxes, an initial alignment based on these rough boxes, and point cloud registration combining point-to-point and plane-to-plane methods (a minimal code sketch of this kind of pipeline is given after the metadata table below). The experimental results show that our method improves the quality of bounding boxes: it achieves a 10% increase in recall at an IoU threshold of 0.7 and also outperforms feature-based registration methods in terms of translation error and rotation error. | en |
dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2024-08-16T17:40:08Z No. of bitstreams: 0 | en |
dc.description.provenance | Made available in DSpace on 2024-08-16T17:40:08Z (GMT). No. of bitstreams: 0 | en |
dc.description.tableofcontents | Acknowledgements i
Abstract (Chinese) ii
Abstract iii
Contents v
List of Figures viii
List of Tables ix
Chapter 1 Introduction 1
1.1 Motivation 1
1.2 Contribution 3
1.3 Thesis Organization 3
Chapter 2 Background and Related Works 5
2.1 Background 5
2.2 Related Works 6
2.2.1 Multi-frame 3D Object Detection 6
2.2.2 Point Cloud Registration 7
2.2.3 Unsupervised 3D Object Detection 9
Chapter 3 System Architecture and Problem Definition 10
3.1 System Architecture 10
3.2 Problem Definition 11
3.3 Challenges 12
Chapter 4 Design and Implementation 13
4.1 Workflow 13
4.2 Bounding Box Estimator 14
4.3 Initial Alignment 17
4.4 Point Cloud Registration 19
4.5 Final Bounding Box Generation 20
Chapter 5 Experiment Evaluation 21
5.1 Evaluation Dataset 21
5.2 Evaluation Metrics and Methodology 26
5.3 Quantitative Results 27
5.3.1 Translation Error and Rotation Error between Consecutive Frames 27
5.3.2 Translation Error between Non-consecutive Frames 28
5.3.3 Cover Rate Versus Translation Error 28
5.3.4 Average IoU and Recall on Carla Dataset 29
5.3.5 Average IoU and Recall on WAYSIDE Dataset 30
5.4 Time Complexity and Execution Time Analysis 30
5.4.1 Bounding Box Estimator 31
5.4.2 Initial Alignment 31
5.4.3 Point Cloud Registration 32
5.5 Qualitative Results 33
Chapter 6 Conclusion 36
References 37 | - |
dc.language.iso | en | - |
dc.title | 基於多幀點雲對齊的三維物體偵測增強 | zh_TW |
dc.title | 3D Object Detection Enhancement Based on Multi-frame Point Cloud Alignment | en |
dc.type | Thesis | - |
dc.date.schoolyear | 112-2 | - |
dc.description.degree | Master's | - |
dc.contributor.oralexamcommittee | 郭峻因;洪士灝;蔡欣穆;張瑞益 | zh_TW |
dc.contributor.oralexamcommittee | Jiun-In Guo;Shih-Hao Hung;Hsin-Mu Tsai;Ray-I Chang | en |
dc.subject.keyword | 三維物件偵測,點雲配準,多幀點雲,自駕車,無監督學習, | zh_TW |
dc.subject.keyword | 3D Object Detection,Point Cloud Registration,Multi-frame Point Clouds,Autonomous Vehicles,Unsupervised Learning | en |
dc.relation.page | 38 | - |
dc.identifier.doi | 10.6342/NTU202403763 | - |
dc.rights.note | Authorized (worldwide public access) | - |
dc.date.accepted | 2024-08-12 | - |
dc.contributor.author-college | College of Electrical Engineering and Computer Science | - |
dc.contributor.author-dept | Department of Computer Science and Information Engineering | - |
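The abstract above describes the alignment pipeline only at a high level. The following is a minimal sketch of that general shape of pipeline using the open-source Open3D library; the function names (`box_pose`, `align_vehicle_clouds`), the threshold and neighborhood values, and the use of Open3D's point-to-point ICP followed by Generalized-ICP as the plane-to-plane stage are all illustrative assumptions, not the thesis's actual implementation.

```python
# Minimal sketch (not the thesis code): rough-box initial alignment of two
# sparse vehicle point clouds followed by two-stage registration, using
# Open3D. All distance thresholds and neighbor counts are illustrative.
import numpy as np
import open3d as o3d


def box_pose(pcd: o3d.geometry.PointCloud) -> np.ndarray:
    """4x4 pose built from a rough oriented bounding box of the points."""
    box = pcd.get_oriented_bounding_box()
    pose = np.eye(4)
    pose[:3, :3] = box.R       # box orientation
    pose[:3, 3] = box.center   # box center
    return pose


def align_vehicle_clouds(source: o3d.geometry.PointCloud,
                         target: o3d.geometry.PointCloud) -> np.ndarray:
    """Return a 4x4 transform mapping `source` onto `target`."""
    # Initial alignment: carry the source box pose onto the target box pose.
    init = box_pose(target) @ np.linalg.inv(box_pose(source))

    # Stage 1: point-to-point ICP refines the box-based initial guess.
    p2p = o3d.pipelines.registration.registration_icp(
        source, target, 0.5, init,
        o3d.pipelines.registration.TransformationEstimationPointToPoint())

    # Stage 2: Generalized-ICP, a plane-to-plane style refinement; it
    # relies on per-point covariances, estimated here from 20 neighbors.
    for pcd in (source, target):
        pcd.estimate_covariances(o3d.geometry.KDTreeSearchParamKNN(knn=20))
    gicp = o3d.pipelines.registration.registration_generalized_icp(
        source, target, 0.3, p2p.transformation,
        o3d.pipelines.registration.TransformationEstimationForGeneralizedICP())
    return gicp.transformation
```

In the setting the abstract describes, such a pairwise alignment would presumably be applied across the frames of a tracked vehicle, the transformed points aggregated into a denser cloud, and a final bounding box fitted to that aggregate.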
Appears in Collections: | Department of Computer Science and Information Engineering |
Files in this item:
File | Size | Format | |
---|---|---|---|
ntu-112-2.pdf | 23.43 MB | Adobe PDF | View/Open |
All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated in their respective license terms.