Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/84641
Full metadata record
DC Field: Value [Language]
dc.contributor.advisor: 施吉昇 (Chi-Sheng Shih)
dc.contributor.author: Yi-Hung Kuo [en]
dc.contributor.author: 郭羿宏 [zh_TW]
dc.date.accessioned: 2023-03-19T22:18:42Z
dc.date.copyright: 2022-09-19
dc.date.issued: 2022
dc.date.submitted: 2022-09-14
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/84641
dc.description.abstract: 使用光達點雲的三維物件偵測被廣泛地應用於自駕車以及路側系統的物件偵測與追蹤。然而使用單光達點雲進行偵測的表現會受到物件遮蔽的影響。許多過往的研究提出將物件點雲補齊來提升三維物件偵測的表現。多數研究的模型輸入為單一物件的點雲,需要額外的前處理將物件點雲從光達掃描中取出。有些研究則是以整個光達掃描點雲作為模型輸入,但仍需要物件的三維標記,而新場景標記的取得耗時且困難。本研究利用路側系統多光達掃描之特性,設計了一個自主標記的流程。該流程對去背景之多光達融合點雲使用 DBSCAN 演算法得到物件三維標記,並以此在多光達融合點雲上取得物件之多光達視角。補點模型的訓練則是將多光達點雲拆成單一光達點雲作為輸入,並以物件之多光達視角點雲作為訓練的輸出。此流程使得補點模型得以在沒有人工標記下完成訓練:體積像素 (Voxel) IoU 相對於補點前增加約 17%,體積像素 (Voxel) recall 相對於補點前提升約 22%。DBSCAN 的 3D AP@IoU=0.25 在補點後提升約 20%,並超越使用雙光達作為輸入的結果。深度學習模型的偵測表現亦有提升,如 PointPillars 及 PV-RCNN 在 KITTI 資料集中 Easy 類別的 3D AP@IoU=0.5 分別提升約 3% 和 1%。 [zh_TW]
dc.description.abstract: 3D object detection using LiDAR-scanned point clouds is widely used in applications such as self-driving vehicles and roadside vehicle detection and tracking. However, detection performance with single-LiDAR-scanned point clouds suffers from occlusion. Several previous works proposed completing the object point cloud to improve 3D detection. Most of these works take individual object point clouds as input, which requires additional preprocessing to extract objects from the LiDAR scans. Other works complete object point clouds from full LiDAR scans, but they require human-labeled 3D object bounding boxes, which are difficult and costly to obtain for new scenarios. This work designs a self-labeling method that exploits the characteristics of point clouds scanned by the multiple LiDARs of a roadside unit. The method labels objects by running DBSCAN on the background-removed point cloud fused from multiple LiDARs, and the resulting labels are used to extract object points from the fused point cloud. The point completion model is then trained with the single-LiDAR-scanned point cloud as input and the extracted object points from the fused point cloud as the training target, so it can be trained without human-labeled bounding boxes. The voxel IoU of the completed point clouds increased by about 17% compared with the raw point clouds, and the voxel recall increased by about 22%. The detection performance of DBSCAN (3D AP@IoU=0.25) increased by about 20% after applying point completion, surpassing the result obtained with two LiDARs as input. Deep-learning-based detectors also benefit: the 3D AP@IoU=0.5 of PointPillars and PV-RCNN for easy targets in the KITTI dataset increased by about 3% and 1%, respectively. [en]
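For concreteness, the sketch below illustrates the two ideas the abstract leans on: generating object labels by clustering the background-removed, multi-LiDAR fused point cloud with DBSCAN, and scoring a completed point cloud against its fused-view target with voxel-wise IoU and recall. It is a minimal sketch assuming NumPy and scikit-learn; the function names (label_objects, voxelize, voxel_iou_recall), the clustering parameters, and the 0.1 m voxel size are illustrative assumptions, not the thesis's actual implementation or settings.

```python
# Minimal sketch (not the thesis's implementation): self-labeling objects with
# DBSCAN on a background-removed, multi-LiDAR fused point cloud, and computing
# voxel-wise IoU / recall between a completed cloud and its fused-view target.
# Parameter values and function names are illustrative assumptions.
import numpy as np
from sklearn.cluster import DBSCAN


def label_objects(fused_points, eps=0.5, min_samples=10):
    """Cluster foreground points (N x 3 array) into per-object point sets.

    `fused_points` is assumed to already have background (road, buildings)
    removed; DBSCAN groups the remaining points into objects, and noise
    points (label -1) are discarded.
    """
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(fused_points)
    return [fused_points[labels == k] for k in np.unique(labels) if k != -1]


def voxelize(points, voxel_size=0.1):
    """Map points to the set of voxel indices they occupy."""
    return set(map(tuple, np.floor(points / voxel_size).astype(int)))


def voxel_iou_recall(completed, target, voxel_size=0.1):
    """Voxel-wise IoU and recall of a completed object cloud against its
    multi-LiDAR fused-view target (the self-labeled training output)."""
    a, b = voxelize(completed, voxel_size), voxelize(target, voxel_size)
    inter = len(a & b)
    iou = inter / len(a | b) if (a | b) else 0.0
    recall = inter / len(b) if b else 0.0
    return iou, recall
```

In this sketch, feeding label_objects the fused cloud would yield the per-object multi-LiDAR views used as training targets, and voxel_iou_recall is roughly how the ~17% IoU and ~22% recall improvements quoted above would be measured, albeit with the thesis's own voxel size rather than the placeholder value used here.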
dc.description.provenance: Made available in DSpace on 2023-03-19T22:18:42Z (GMT). No. of bitstreams: 1
U0001-1409202214243600.pdf: 90251231 bytes, checksum: 9917ed4aafdf57d776a8dc61f6e07bc5 (MD5)
Previous issue date: 2022 [en]
dc.description.tableofcontents:
口試委員會審定書 (Oral Defense Committee Certification)
致謝 (Acknowledgements)
摘要 (Chinese Abstract)
Abstract
1 Introduction
  1.1 Motivation
  1.2 Contributions
  1.3 Thesis Organization
2 Background and Related Works
  2.1 Background
  2.2 Related Works
    2.2.1 PointFlow
    2.2.2 Semantic Point Generation
    2.2.3 Behind the Curtain Detector
3 System Architecture and Problem Definition
  3.1 System Architecture and Term Definition
  3.2 Goals and Requirements
  3.3 Problem Definition
  3.4 Challenges
4 Design and Implementation
  4.1 Network Architecture
  4.2 Non-occluded 3-Dimensional Point Cloud Data
    4.2.1 Data Collection
    4.2.2 Self-labeling Dataset Generation
  4.3 Training Algorithm
    4.3.1 Preprocessing
    4.3.2 Training Loss
    4.3.3 Postprocessing
5 Experiment Evaluation
  5.1 Evaluation Datasets
  5.2 Evaluation Metrics and Methodology
    5.2.1 Evaluation Metrics
    5.2.2 Methodology
  5.3 Quantitative Results
    5.3.1 Voxel-wise point completion performance
    5.3.2 DBSCAN detector improvements with point completion model
    5.3.3 Point completion performance compared with PointFlow
    5.3.4 Deep-learning-based detector improvement with point completion
    5.3.5 Impact on weak classification performance by point completion model
  5.4 Qualitative Results
    5.4.1 Performance comparison between different point cloud inputs to the cluster-based algorithm
    5.4.2 Visual comparison between two different labeling-rule-labeled datasets
    5.4.3 Performance of the model in untrained RoI
  5.5 Ablation Studies
    5.5.1 Performance of adding point clouds from multiple LiDARs
    5.5.2 Effect of different distance kernels on performance
    5.5.3 Effect of different voxel sizes on performance
    5.5.4 The metrics impact resulting from labeling rules
6 Conclusion
  6.1 Future Work
Bibliography
dc.language.iso: zh-TW
dc.subject: 點雲補點 [zh_TW]
dc.subject: 多光達 [zh_TW]
dc.subject: 路側系統 [zh_TW]
dc.subject: 自監督 [zh_TW]
dc.subject: 基於體積像素 [zh_TW]
dc.subject: Roadside unit [en]
dc.subject: Self-supervised [en]
dc.subject: Point cloud completion [en]
dc.subject: Voxel-based [en]
dc.subject: Multiple LiDARs [en]
dc.title: 自監督使用單光達生成多光達物件點雲視角 [zh_TW]
dc.title: Self-supervised Multi-LiDAR Object View Generation using Single LiDAR [en]
dc.type: Thesis
dc.date.schoolyear: 110-2
dc.description.degree: 碩士 (Master)
dc.contributor.oralexamcommittee: 傅立成 (Li-Chen Fu), 林忠緯 (Chung-Wei Lin), 涂嘉恒 (Chia-Heng Tu)
dc.subject.keyword: 點雲補點, 自監督, 多光達, 基於體積像素, 路側系統 [zh_TW]
dc.subject.keyword: Point cloud completion, Self-supervised, Multiple LiDARs, Voxel-based, Roadside unit [en]
dc.relation.page: 68
dc.identifier.doi: 10.6342/NTU202203392
dc.rights.note: 同意授權(限校園內公開) (Authorization granted; restricted to on-campus access)
dc.date.accepted: 2022-09-16
dc.contributor.author-college: 電機資訊學院 (College of Electrical Engineering and Computer Science) [zh_TW]
dc.contributor.author-dept: 資訊工程學研究所 (Graduate Institute of Computer Science and Information Engineering) [zh_TW]
dc.date.embargo-lift: 2022-09-19
Appears in Collections: 資訊工程學系 (Department of Computer Science and Information Engineering)

Files in This Item:
File: U0001-1409202214243600.pdf (access restricted to NTU campus IP addresses; off-campus users should connect via the VPN service)
Size: 88.14 MB
Format: Adobe PDF