Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/82189

Full metadata record

| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 傅立成(Li-Chen Fu) | |
| dc.contributor.author | Chia-Hung Wang | en |
| dc.contributor.author | 王嘉鴻 | zh_TW |
| dc.date.accessioned | 2022-11-25T06:33:24Z | - |
| dc.date.copyright | 2021-11-09 | |
| dc.date.issued | 2021 | |
| dc.date.submitted | 2021-08-19 | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/82189 | - |
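| dc.description.abstract | Traditionally, the automotive industry has been hardware-oriented. In recent years, the explosive growth of deep learning has pushed computer vision to a new level, so autonomous driving is no longer an unreachable dream. The first stage of autonomous driving is perceiving the environment: common sensors such as LiDAR and cameras provide the information used to detect objects on the road. However, each sensor has its own strengths and weaknesses, so multi-sensor fusion is an effective way to improve detection results. This thesis proposes a deep learning method that integrates LiDAR point cloud features with camera image features to detect 3D on-road objects. It proposes a novel fusion-based 3D object detection network called the Voxel-Pixel Fusion Network. Its Voxel-Pixel Fusion Layer comprises three modules (Parameter Feature Generating, Parameter-Based Weighting, and Voxel-Pixel Fusion) and bidirectionally fuses the features of each voxel-pixel pair according to their geometric relationship. In addition, the proposed voxel-pixel parameters account for the characteristics of each voxel-pixel pair and strengthen the fusion. The model is trained on the well-known KITTI dataset and evaluated on the official test set, whose annotations are not publicly released. Experimental results show that the method achieves a mean average precision of 65.99% for 3D object detection across multiple classes and difficulty levels. Notably, the method ranks first on the KITTI leaderboard for the challenging pedestrian class. | zh_TW |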
| dc.description.provenance | Made available in DSpace on 2022-11-25T06:33:24Z (GMT). No. of bitstreams: 1 U0001-2507202117231500.pdf: 4116803 bytes, checksum: 61c57782e835ba55b520490478e08ffa (MD5) Previous issue date: 2021 | en |
| dc.description.tableofcontents | 摘要 (Chinese abstract); Abstract; Table of Contents; List of Figures; List of Tables; Chapter 1 Introduction; 1.1 Background; 1.2 Motivation; 1.3 Challenge; 1.4 Objective; 1.5 Related Work; 1.5.1 Camera-Based 3D Object Detector; 1.5.2 LiDAR-Based 3D Object Detector; 1.5.3 Fusion-Based 3D Object Detector; 1.6 Contribution; 1.7 Thesis Organization; Chapter 2 Preliminaries; 2.1 Artificial Neural Network; 2.1.1 Fully Connected Layer; 2.1.2 Normalization Layer; 2.1.3 Activation Function; 2.1.4 Backpropagation; 2.1.5 Stochastic Gradient Descent Optimizer; 2.1.6 Adam Optimizer; 2.2 Convolutional Neural Network; 2.2.1 Convolutional Layer; 2.2.2 Pooling Layer; 2.2.3 Normalization Layer; 2.2.4 Fully Connected Layer; 2.3 Image Classification Framework; 2.3.1 AlexNet; 2.3.2 VGGNet; 2.3.3 ResNet; 2.3.4 ResNeXt; 2.4 Object Detection Framework; 2.4.1 Faster R-CNN; 2.4.2 Cascade R-CNN; 2.4.3 PV-RCNN; Chapter 3 Voxel-Pixel Fusion Network; 3.1 Problem Formulation; 3.2 Architecture Overview; 3.3 Pre-processing; 3.4 Backbone Network; 3.4.1 Backbone Network in LiDAR Stream; 3.4.2 Backbone Network in Camera Stream; 3.5 Voxel-Pixel Fusion Layer; 3.5.1 Parameter Feature Generating Module; 3.5.2 Parameter-Based Weighting Module; 3.5.3 Voxel-Pixel Fusion Module; 3.6 Region Proposal Network; 3.6.1 Region Proposal Network in LiDAR Stream; 3.6.2 Region Proposal Network in Camera Stream; 3.7 Detection Network; 3.7.1 Detection Network in LiDAR Stream; 3.7.2 Detection Network in Camera Stream; 3.8 Loss Function; Chapter 4 Experiment; 4.1 KITTI Vision Benchmark; 4.2 Evaluation Metric; 4.2.1 Intersection over Union; 4.2.2 Mean Average Precision; 4.3 Implementation Detail; 4.3.1 Experimental Platform; 4.3.2 Training Detail; 4.4 Ablation Study; 4.4.1 Effect of Modules within VPF Layer; 4.4.2 Effect of Arrangement of VPF Layers; 4.5 Quantitative Result; 4.5.1 3D Object Detection; 4.5.2 BEV Object Detection; 4.6 Qualitative Result; 4.6.1 Visualization; 4.6.2 Precision-Recall Curve; 4.7 Discussion; Chapter 5 Conclusion; Reference | |
| dc.language.iso | en | |
| dc.subject | 電腦視覺 | zh_TW |
| dc.subject | 感知融合 | zh_TW |
| dc.subject | 自動駕駛 | zh_TW |
| dc.subject | 三維物件偵測 | zh_TW |
| dc.subject | 深度學習 | zh_TW |
| dc.subject | 人工智慧 | zh_TW |
| dc.subject | deep learning | en |
| dc.subject | artificial intelligence | en |
| dc.subject | autonomous driving | en |
| dc.subject | computer vision | en |
| dc.subject | sensor fusion | en |
| dc.subject | 3D object detection | en |
| dc.title | 應用於三維道路物件偵測之體素與像素融合網路 | zh_TW |
| dc.title | Voxel-Pixel Fusion Network for 3D On-road Object Detection | en |
| dc.date.schoolyear | 109-2 | |
| dc.description.degree | Master's | |
| dc.contributor.coadvisor | 蕭培墉(Pei-Yung Hsiao) | |
| dc.contributor.oralexamcommittee | 傅楸善(Hsin-Tsai Liu),方瓊瑤(Chih-Yang Tseng),林忠緯 | |
| dc.subject.keyword | 感知融合, 三維物件偵測, 自動駕駛, 電腦視覺, 深度學習, 人工智慧 | zh_TW |
| dc.subject.keyword | sensor fusion, 3D object detection, autonomous driving, computer vision, deep learning, artificial intelligence | en |
| dc.relation.page | 121 | |
| dc.identifier.doi | 10.6342/NTU202101725 | |
| dc.rights.note | Not authorized | |
| dc.date.accepted | 2021-08-20 | |
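| dc.contributor.author-college | 電機資訊學院 (College of Electrical Engineering and Computer Science) | zh_TW |
| dc.contributor.author-dept | 資訊工程學研究所 (Graduate Institute of Computer Science and Information Engineering) | zh_TW |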
| dc.date.embargo-lift | 2026-08-01 | - |
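| Appears in collections: | 資訊工程學系 (Department of Computer Science and Information Engineering) | |
Files in this item:
| File | Size | Format | |
|---|---|---|---|
| U0001-2507202117231500.pdf (not authorized for public access) | 4.02 MB | Adobe PDF | View/Open |
Unless their copyright terms are specifically stated otherwise, all items in the system are protected by copyright, with all rights reserved.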
