Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/91519
Title: | Unsupervised Learning for 3D Point Cloud Object Detection using Roadside Dataset |
Author: | Min-Jyun Wu (吳泯駿) |
Advisor: | Chi-Sheng Shih (施吉昇) |
Keywords: | Autonomous Vehicle, Unsupervised Learning, 3D Object Detection, Machine Learning, Deep Learning |
Publication Year: | 2023 |
Degree: | Master |
Abstract: | Autonomous driving promises to improve road safety and reduce traffic congestion. To make safe and efficient decisions, a self-driving car must first perceive and localize the other vehicles around it; 3D object detection, the task of determining the size and location of those vehicles, is therefore critical for autonomous driving.

Currently, most work on 3D object detection is supervised, meaning the deep neural networks are trained on high-quality, human-annotated datasets. Supervised approaches have several drawbacks. For example, the accuracy of a supervised model deteriorates when the training and test data do not match, i.e., when the shapes, sizes, or even classes of vehicles differ between the two. In such cases, one must annotate 3D labels in the test data and retrain the model to recover high accuracy. However, annotating 3D labels in point clouds is widely considered challenging and is far more laborious than annotating 2D labels in images: a 2D bounding box only needs to indicate an object's 2D position and size in the image, whereas a 3D bounding box has an additional dimension in both position and size and must also specify the object's orientation.

To address these problems, this work proposes a method that generates 3D labels on roadside point clouds without human annotation. The method consists of three steps: the first generates preliminary detection results, the second refines those results using object-tracking information, and the third boosts the model's performance through self-training. The trained model can then perform 3D object detection in real time and across different data domains.

The experiments show that the proposed method can train a robust 3D object detection model without any supervision. On the roadside data, the model trained with our method reaches 90.27% AP for scooters and 91.33% AP for cars. In addition, experiments on the nuScenes dataset show that the model trained with our method improves performance by 7.43% over the model trained with previous unsupervised object detection work. |
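The abstract's argument that 3D annotation is harder than 2D annotation comes down to parameter count. The following is a minimal illustrative sketch (not code from the thesis) of the two label parameterizations it describes; the field names and units are assumptions for illustration only.

```python
# Sketch of the two bounding-box label formats contrasted in the abstract.
# A 2D image-plane box has 4 degrees of freedom; a 3D box has 7, and the
# extra orientation (yaw) term is especially hard to judge by hand in a
# sparse point cloud.
from dataclasses import dataclass


@dataclass
class Box2D:
    # 4 parameters: image-plane position and size.
    cx: float  # box center, x (pixels)
    cy: float  # box center, y (pixels)
    w: float   # width (pixels)
    h: float   # height (pixels)


@dataclass
class Box3D:
    # 7 parameters: 3D position, 3D size, and heading.
    x: float    # box center, x (meters, sensor frame; assumed convention)
    y: float    # box center, y
    z: float    # box center, z
    l: float    # length
    w: float    # width
    h: float    # height
    yaw: float  # heading angle around the vertical axis (radians)


# Example usage: one hand-labeled 2D box vs. the richer 3D label a point
# cloud annotator would have to produce for the same object.
car_2d = Box2D(cx=640.0, cy=360.0, w=120.0, h=80.0)
car_3d = Box3D(x=12.3, y=-4.1, z=0.9, l=4.5, w=1.8, h=1.5, yaw=1.57)
```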
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/91519 |
DOI: | 10.6342/NTU202303102 |
Full-text Access: | Not authorized |
Appears in Collections: | Department of Computer Science and Information Engineering |
Files in This Item:
File | Size | Format |
---|---|---|
ntu-111-2.pdf (not currently authorized for public access) | 20.89 MB | Adobe PDF |
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.