NTU Theses and Dissertations Repository
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/90134
Full metadata record
dc.contributor.advisor: Ming-Sui Lee (李明穗)
dc.contributor.author: Bo-Wei Chen (陳柏維)
dc.date.accessioned: 2023-09-22T17:33:23Z
dc.date.available: 2023-11-09
dc.date.copyright: 2023-09-22
dc.date.issued: 2023
dc.date.submitted: 2023-08-11
dc.identifier.citationX.Bai, Z.Hu,X.Zhu,Q.Huang,Y.Chen,H.Fu,andC.-L.Tai. Transfusion: Robust lidar-camera fusion for 3d object detection with transformers. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 1090 1099, 2022.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/90134
dc.description.abstract: With the continuous development of deep learning technology, the accuracy of object detection has been steadily improving, and the realization of Level 5 autonomous driving is within reach. In favorable weather conditions, the average precision of object detection can exceed 85 percent. However, the weather is not always ideal, and conditions such as rain, fog, and even snow can significantly reduce detection accuracy. Traditional sensors like cameras and LiDAR are susceptible to harsh weather, so we adopt a fusion of RADAR and LiDAR for object detection. RADAR is unaffected by adverse environmental conditions but introduces many noisy point clouds; we therefore use LiDAR as an auxiliary sensor, since it provides accurate environmental point clouds that help mitigate ghost detections. We employ an attention mechanism to fuse the features from LiDAR and RADAR. Additionally, we propose a Feature Selection Module to address the issue of attention weights in the attention mechanism, and we introduce an Associative Feature Fusion Module to fully utilize the features selected by the attention mechanism. Experiments demonstrate that our proposed model outperforms state-of-the-art RADAR and LiDAR models.
dc.description.provenance: Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2023-09-22T17:33:23Z. No. of bitstreams: 0
dc.description.provenance: Made available in DSpace on 2023-09-22T17:33:23Z (GMT). No. of bitstreams: 0
dc.description.tableofcontents:
Verification Letter from the Oral Examination Committee i
Acknowledgements ii
摘要 (Chinese abstract) iii
Abstract iv
Contents vi
List of Figures viii
List of Tables x
Chapter 1 Introduction 1
Chapter 2 Related Work 6
2.1 Unimodal Sensor Detection 6
2.1.1 Radar-Only Detection 7
2.1.2 LiDAR-Only Detection 8
2.2 Multimodal Sensor Fusion Detection 9
2.2.1 LiDAR and Camera Fusion Detection 10
2.2.2 LiDAR and Radar Fusion Detection 10
Chapter 3 Method 12
3.1 Framework Overview 13
3.2 Feature Selection Module 14
3.2.1 Feature Spatial Selection 14
3.2.2 Feature Channel Selection 14
3.3 Associative Feature Fusion Module 15
Chapter 4 Experiments 17
4.1 Dataset Description 17
4.2 Implementation Details 18
4.2.1 Evaluation Metrics 18
4.3 Comparison with Radar Only 19
4.4 Comparison with LiDAR Only 19
4.5 Comparison with Radar and LiDAR 20
4.6 Ablation Study 21
4.6.1 Comparison with Other Feature Fusion Operations 21
4.6.2 Ablation of Model Components 22
4.7 Discussion 23
Chapter 5 Conclusion 29
References 31
dc.language.iso: en
dc.subject: deep learning
dc.subject: multimodal object detection
dc.subject: feature fusion based on attention mechanism
dc.title: Fusion of Radar and LiDAR Using Associative Mechanism for Object Detection in Adverse Weather Conditions
dc.type: Thesis
dc.date.schoolyear: 111-2
dc.description.degree: Master's
dc.contributor.oralexamcommittee: Mei-Chen Yeh (葉梅珍); Li-Jie Xi (李界羲)
dc.subject.keyword: deep learning, multimodal object detection, feature fusion based on attention mechanism
dc.relation.page: 36
dc.identifier.doi: 10.6342/NTU202302360
dc.rights.note: Authorization granted (open access worldwide)
dc.date.accepted: 2023-08-12
dc.contributor.author-college: College of Electrical Engineering and Computer Science
dc.contributor.author-dept: Graduate Institute of Networking and Multimedia
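
The abstract above sketches the architecture only at a high level: an attention mechanism fuses LiDAR and RADAR features, a Feature Selection Module performs spatial and channel selection (per the table of contents), and an Associative Feature Fusion Module consumes the selected features. The thesis PDF is the authoritative description; purely as an illustration of this general pattern, the following is a minimal PyTorch sketch of channel/spatial gating followed by cross-attention fusion over bird's-eye-view feature maps. Every module name, tensor shape, and gating choice below is a hypothetical stand-in, not the thesis's actual implementation.

```python
# Illustrative sketch only: all module names, shapes, and gating choices are
# assumptions for this example, not the design described in the thesis PDF.
import torch
import torch.nn as nn


class FeatureSelection(nn.Module):
    """Hypothetical spatial + channel selection over a BEV feature map."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        # Channel selection: squeeze-and-excitation style global gate.
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # Spatial selection: per-location gate from pooled channel statistics.
        self.spatial_gate = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x * self.channel_gate(x)  # re-weight channels
        stats = torch.cat(
            [x.mean(dim=1, keepdim=True), x.amax(dim=1, keepdim=True)], dim=1
        )
        return x * self.spatial_gate(stats)  # re-weight spatial locations


class AttentionFusion(nn.Module):
    """Hypothetical cross-attention fusion of RADAR and LiDAR BEV features."""

    def __init__(self, channels: int, heads: int = 4):
        super().__init__()
        self.select_radar = FeatureSelection(channels)
        self.select_lidar = FeatureSelection(channels)
        # RADAR queries attend to LiDAR keys/values so that accurate LiDAR
        # geometry can down-weight noisy RADAR responses (ghost detections).
        self.cross_attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.mix = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, radar: torch.Tensor, lidar: torch.Tensor) -> torch.Tensor:
        b, c, h, w = radar.shape
        radar_sel = self.select_radar(radar)
        lidar_sel = self.select_lidar(lidar)
        q = radar_sel.flatten(2).transpose(1, 2)   # (B, H*W, C)
        kv = lidar_sel.flatten(2).transpose(1, 2)  # (B, H*W, C)
        attended, _ = self.cross_attn(q, kv, kv)
        attended = attended.transpose(1, 2).reshape(b, c, h, w)
        # Stand-in for the associative fusion step: a learned 1x1 mix of the
        # attended features with the gated RADAR stream.
        return self.mix(torch.cat([attended, radar_sel], dim=1))


if __name__ == "__main__":
    radar = torch.randn(2, 64, 32, 32)  # fake RADAR BEV features
    lidar = torch.randn(2, 64, 32, 32)  # fake LiDAR BEV features
    print(AttentionFusion(64)(radar, lidar).shape)  # torch.Size([2, 64, 32, 32])
```

The choice of RADAR as the query stream mirrors the abstract's framing (LiDAR as the auxiliary sensor that suppresses RADAR ghost detections); swapping the roles or fusing symmetrically would be equally plausible sketches.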
Appears in Collections: Graduate Institute of Networking and Multimedia

Files in This Item:
ntu-111-2.pdf (5.12 MB, Adobe PDF)


All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.
