Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/91519
Full metadata record

DC Field: Value (Language)
dc.contributor.advisor: 施吉昇 (zh_TW)
dc.contributor.advisor: Chi-Sheng Shih (en)
dc.contributor.author: 吳泯駿 (zh_TW)
dc.contributor.author: Min-Jyun Wu (en)
dc.date.accessioned: 2024-01-28T16:21:42Z
dc.date.available: 2024-01-29
dc.date.copyright: 2024-01-27
dc.date.issued: 2023
dc.date.submitted: 2023-08-07
dc.identifier.citation[1] S. Shi, C. Guo, L. Jiang, Z. Wang, J. Shi, X. Wang, and H. Li, “Pv-rcnn: Point-voxel feature set abstraction for 3d object detection,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 10 529–10 538.
[2] A. Geiger, P. Lenz, C. Stiller, and R. Urtasun, “Vision meets robotics: The kitti dataset,” International Journal of Robotics Research (IJRR), 2013.
[3] Y. Yan, Y. Mao, and B. Li, “Second: Sparsely embedded convolutional detection,”Sensors, vol. 18, no. 10, p. 3337, 2018.
[4] Y. Zhou and O. Tuzel, “Voxelnet: End-to-end learning for point cloud based 3d object detection,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2018, pp. 4490–4499.
[5] J. Deng, S. Shi, P. Li, W. Zhou, Y. Zhang, and H. Li, “Voxel r-cnn: Towards high performance voxel-based 3d object detection,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, no. 2, 2021, pp. 1201–1209.
[6] T. Yin, X. Zhou, and P. Krahenbuhl, “Center-based 3d object detection and tracking,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2021, pp. 11 784–11 793.
[7] C. R. Qi, H. Su, K. Mo, and L. J. Guibas, “Pointnet: Deep learning on point sets for 3d classification and segmentation,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2017, pp. 652–660.
[8] C. R. Qi, L. Yi, H. Su, and L. J. Guibas, “Pointnet++: Deep hierarchical feature learning on point sets in a metric space,” Advances in neural information processing systems, vol. 30, 2017.
[9] C. R. Qi, W. Liu, C. Wu, H. Su, and L. J. Guibas, “Frustum pointnets for 3d object detection from rgb-d data,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2018, pp. 918–927.
[10] S. Shi, X. Wang, and H. Li, “Pointrcnn: 3d object proposal generation and detection from point cloud,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2019, pp. 770–779.
[11] Z. Yang, Y. Sun, S. Liu, X. Shen, and J. Jia, “Std: Sparse-to-dense 3d object detector for point cloud,” in Proceedings of the IEEE/CVF international conference on computer vision, 2019, pp. 1951–1960.
[12] R. Qian, X. Lai, and X. Li, “Boundary-aware 3d object detection from point clouds. arxiv 2021,” arXiv preprint arXiv:2104.10330.
[13] R. Kesten, M. Usman, J. Houston, T. Pandya, K. Nadhamuni, A. Ferreira, M. Yuan, B. Low, A. Jain, P. Ondruska, S. Omari, S. Shah, A. Kulkarni, A. Kazakova, C. Tao, L. Platinsky, W. Jiang, and V. Shet, “Level 5 perception dataset 2020,” https://level-5.global/level5/data/, 2019.
[14] A. Geiger, P. Lenz, and R. Urtasun, “Are we ready for autonomous driving? the kitti vision benchmark suite,” in Conference on Computer Vision and Pattern Recognition (CVPR), 2012.
[15] M. Ester, H.-P. Kriegel, J. Sander, X. Xu et al., “A density-based algorithm for discovering clusters in large spatial databases with noise.” in kdd, vol. 96, no. 34, 1996, pp. 226–231.
[16] X. Zhang, W. Xu, C. Dong, and J. M. Dolan, “Efficient l-shape fitting for vehicle detection using laser scanners,” in 2017 IEEE Intelligent Vehicles Symposium (IV). IEEE, 2017, pp. 54–59.
[17] Y. You, K. Luo, C. P. Phoo, W.-L. Chao, W. Sun, B. Hariharan, M. Campbell, and K. Q. Weinberger, “Learning to detect mobile objects from lidar scans without labels,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 1130–1140.
[18] G. Welch, G. Bishop et al., “An introduction to the kalman filter,” 1995.
[19] H. Caesar, V. Bankiti, A. H. Lang, S. Vora, V. E. Liong, Q. Xu, A. Krishnan, Y. Pan, G. Baldan, and O. Beijbom, “nuscenes: A multimodal dataset for autonomous driving,” in CVPR, 2020.
[20] R. Girshick, J. Donahue, T. Darrell, and J. Malik, “Rich feature hierarchies for accurate object detection and semantic segmentation,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2014, pp. 580–587.
-
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/91519
dc.description.abstract (zh_TW): 自動駕駛有望改善道路安全並減少交通擁堵。為了做出安全且有效率的決策,自動駕駛汽車必須首先感知並定位道路上的其他車輛。這就是為什麼 3D 物件偵測對於自動駕駛汽車至關重要的原因。它可以幫助確定自駕車周圍其他車輛的大小和位置。
目前,大多數關於 3D 物件偵測的方法都以監督式學習為主,這意味著它們需要利用人工標記的高品質資料集來訓練深度學習網路。然而,監督式學習有幾個問題:例如,當訓練資料和測試資料差異過大時,監督式模型的準確性就會下降。在這種情況下,必須對測試資料進行人工標記並重新訓練模型,才能達到原先的高精度。然而,在點雲中進行 3D 人工標記通常被認為是具有挑戰性的。與在影像中進行人工標記相比,在點雲中獲取 3D 標記要困難得多,因為 2D 邊界框只須指出物件在影像中的 2D 位置以及大小,而 3D 邊界框則在位置及大小上都各多了一個維度,並且還需要指出物件的朝向。
為了解決上述問題,本篇論文提出了一種在沒有人工標記的情況下在路側點雲資料上生成 3D 標記的方法。該方法包含了生成初步偵測結果的第一步驟演算法,以及基於物件追蹤資訊去優化偵測結果的第二步驟演算法,最後則是第三步驟,使用自訓練來優化模型的表現。訓練後的模型則可以用來在不同型態的資料中進行 3D 物件偵測。
實驗結果表明,本篇論文所提出的方法可用於在沒有人工標記資料的情況下訓練出表現良好的 3D 物件偵測模型。在路側資料中,使用我們的方法訓練出的模型可在機車類別達到90.27%的AP,並在汽車類別達到91.33%的AP。此外,在 nuScenes 資料集上進行的實驗表明,與之前的非監督式物件偵測的模型相比,我們方法訓練的模型的性能提高了7.43%。
dc.description.abstract (en): Autonomous driving promises to improve road safety and reduce traffic congestion. To make safe and efficient decisions, self-driving cars must first perceive and localize other vehicles on the road. 3D object detection, the task of determining the size and location of surrounding vehicles, is therefore critical for self-driving cars.
Currently, most work on 3D object detection is supervised, meaning that deep neural networks are trained on high-quality datasets annotated by humans. Supervised learning has several drawbacks. For example, the accuracy of supervised models deteriorates when the training and test data do not match, i.e., when the shapes, sizes, or even classes of vehicles differ between training and test data. In such cases, one must annotate 3D labels in the test data and retrain the model to recover high accuracy. However, annotating 3D labels in point clouds is considered challenging: compared to obtaining 2D labels in an image, obtaining 3D labels in point clouds is far more complicated. A 2D bounding box only needs to specify an object's 2D position and size in the image, whereas a 3D bounding box adds a third dimension to both position and size and must also specify the object's orientation.
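The gap in annotation effort can be made concrete by counting the parameters of each label. A minimal sketch using plain Python dataclasses (the field names are illustrative, not from the thesis): a 2D image box needs four numbers, while a 3D box needs seven, including the heading angle.

```python
from dataclasses import dataclass, fields

@dataclass
class Box2D:
    # 2D bounding box in an image: position and size only
    x: float  # horizontal position (pixels)
    y: float  # vertical position (pixels)
    w: float  # width (pixels)
    h: float  # height (pixels)

@dataclass
class Box3D:
    # 3D bounding box in a point cloud: one extra dimension for both
    # position and size, plus the object's heading angle
    x: float    # center x (meters)
    y: float    # center y (meters)
    z: float    # center z (meters)
    l: float    # length (meters)
    w: float    # width (meters)
    h: float    # height (meters)
    yaw: float  # orientation around the vertical axis (radians)

print(len(fields(Box2D)), len(fields(Box3D)))  # 4 vs. 7 parameters per label
```

Every 3D annotation thus asks the labeler for nearly twice as many values, and the orientation in particular is hard to judge from sparse points.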
This work proposes a method to generate 3D labels on roadside point clouds without supervision, addressing the problems above. The method consists of three steps: the first generates preliminary detection results, the second refines those results using object-tracking information, and the third boosts the model's performance through self-training. The trained model can then perform 3D object detection in real time and across different data domains.
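The preliminary detection step groups background-removed points into object candidates by density-based clustering (the table of contents names DBSCAN for this). As a sketch of that idea, here is a minimal pure-Python DBSCAN over toy 2D points; the point set and the `eps`/`min_pts` values are illustrative only, not the thesis configuration.

```python
import math

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: returns one cluster id per point (-1 = noise).
    Illustrative sketch; the thesis applies the same idea to
    background-removed LiDAR points before fitting bounding boxes."""
    n = len(points)
    labels = [None] * n
    cid = -1

    def neighbors(i):
        # brute-force eps-neighborhood (includes the point itself)
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    for i in range(n):
        if labels[i] is not None:
            continue
        seed = neighbors(i)
        if len(seed) < min_pts:
            labels[i] = -1          # not a core point: mark as noise for now
            continue
        cid += 1                    # start a new cluster from this core point
        labels[i] = cid
        queue = [j for j in seed if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:     # former noise becomes a border point
                labels[j] = cid
            if labels[j] is not None:
                continue
            labels[j] = cid
            jn = neighbors(j)
            if len(jn) >= min_pts:  # j is also core: expand through it
                queue.extend(jn)
    return labels

# two dense groups of points plus one isolated outlier
pts = [(0, 0), (0, 0.1), (0.1, 0), (5, 5), (5, 5.1), (5.1, 5), (10, 10)]
print(dbscan(pts, eps=0.5, min_pts=3))  # -> [0, 0, 0, 1, 1, 1, -1]
```

Each resulting cluster would then be passed to a box-fitting step, and the per-frame boxes linked over time for the tracking-based refinement.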
The experiments show that the proposed method can train a robust 3D object detection model without any supervision. On the roadside data, the model trained by the proposed method reaches 90.27% AP for scooters and 91.33% AP for cars. Experiments on the nuScenes dataset further show that the model trained by the proposed method improves performance by 7.43% compared to the model trained by previous work on unsupervised object detection.
dc.description.provenance: Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2024-01-28T16:21:42Z. No. of bitstreams: 0 (en)
dc.description.provenance: Made available in DSpace on 2024-01-28T16:21:42Z (GMT). No. of bitstreams: 0 (en)
dc.description.tableofcontents:
口試委員會審定書 (Committee Certification) i
致謝 (Acknowledgements) iii
摘要 (Chinese Abstract) iv
Abstract v
1 Introduction 1
1.1 Motivation 1
1.2 Contributions 3
1.3 Thesis Organization 4
2 Background and Related Works 5
2.1 Background 5
2.2 Related Works 6
2.2.1 Supervised 3D Object Detection 6
2.2.2 Unsupervised 3D Object Detection 8
2.3 Data Collection 9
3 Problem Definition and System Architecture 14
3.1 Problem Definition 14
3.2 System Architecture 16
4 Design and Implementation 18
4.1 Preliminary Object Detection and Tracking 20
4.1.1 Background Removal 20
4.1.2 Clustering by DBSCAN 21
4.1.3 BBox Fitting 22
4.1.4 Objects Tracking 26
4.2 Offline Tracking-Based BBox Refinement 28
4.2.1 Limitations and Observations of Preliminary Detection 28
4.2.2 Proposed Refinement Method 29
4.2.3 Anchor Finding 30
4.2.4 Anchor Fitting 32
4.3 Self Training 35
4.3.1 Motivation 35
4.3.2 Proposed Method 35
5 Experiment Evaluation 39
5.1 Evaluation Datasets 39
5.2 Evaluation Metrics and Methodology 40
5.3 Quantitative Results on RSU Dataset 41
5.3.1 Detection Improvements with Tracking-based Refinement 41
5.3.2 Detection Improvements with Self Training 42
5.3.3 Consistent Performance on Different RSU Dataset 45
5.3.4 Performance Comparison on Multiple LiDARs Data 46
5.4 Quantitative Results on nuScenes Dataset 46
5.4.1 Detection Improvements with Tracking-based Refinement 48
5.4.2 Detection Improvements with Self Training 50
5.4.3 Detection Performance Compared to Supervised Model 51
5.4.4 Analysis of Experimental Results 52
5.5 Qualitative Results 56
5.5.1 Visualized Result of Tracking-based Refinement 56
5.5.2 Visualized Result of Self Training 56
5.6 Ablation Studies 59
5.6.1 Effect of Tracking-based Refinement on Self Training Performance 59
5.6.2 Effect of Different Model Architectures on Performance 59
6 Conclusion 61
Bibliography 62
dc.language.iso: en
dc.subject: 深度學習 (zh_TW)
dc.subject: 非監督式學習 (zh_TW)
dc.subject: 三維物件偵測 (zh_TW)
dc.subject: 機器學習 (zh_TW)
dc.subject: 自駕車 (zh_TW)
dc.subject: Deep Learning (en)
dc.subject: Autonomous Vehicle (en)
dc.subject: Unsupervised Learning (en)
dc.subject: 3D Object Detection (en)
dc.subject: Machine Learning (en)
dc.title: 使用路側資料進行非監督式學習之三維物件偵測 (zh_TW)
dc.title: Unsupervised Learning for 3D Point Cloud Object Detection using Roadside Dataset (en)
dc.type: Thesis
dc.date.schoolyear: 111-2
dc.description.degree: 碩士 (Master's)
dc.contributor.oralexamcommittee: 傅立成;林忠緯;涂嘉恒 (zh_TW)
dc.contributor.oralexamcommittee: Li-Chen Fu;Chung-Wei Lin;Chia-Heng Tu (en)
dc.subject.keyword: 自駕車,非監督式學習,三維物件偵測,機器學習,深度學習 (zh_TW)
dc.subject.keyword: Autonomous Vehicle,Unsupervised Learning,3D Object Detection,Machine Learning,Deep Learning (en)
dc.relation.page: 63
dc.identifier.doi: 10.6342/NTU202303102
dc.rights.note: 未授權 (not authorized for public access)
dc.date.accepted: 2023-08-09
dc.contributor.author-college: 電機資訊學院 (College of Electrical Engineering and Computer Science)
dc.contributor.author-dept: 資訊工程學系 (Department of Computer Science and Information Engineering)
Appears in Collections: 資訊工程學系 (Department of Computer Science and Information Engineering)

Files in This Item:
ntu-111-2.pdf: 20.89 MB, Adobe PDF (restricted; not publicly accessible)