DSpace

The institutional repository DSpace system preserves digital materials of all kinds (e.g., text, images, PDF) and makes them easily accessible.

  1. NTU Theses and Dissertations Repository
  2. College of Electrical Engineering and Computer Science
  3. Graduate Institute of Communication Engineering
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/88569
Full metadata record
DC Field | Value | Language
dc.contributor.advisor: 王鈺強 (zh_TW)
dc.contributor.advisor: Yu-Chiang Frank Wang (en)
dc.contributor.author: 紀彥仰 (zh_TW)
dc.contributor.author: Yan-Yang Ji (en)
dc.date.accessioned: 2023-08-15T16:52:40Z
dc.date.available: 2023-11-09
dc.date.copyright: 2023-08-15
dc.date.issued: 2023
dc.date.submitted: 2023-07-28
dc.identifier.citation:
S. Zakharov, W. Kehl, A. Bhargava, and A. Gaidon, “Autolabeling 3d objects with differentiable rendering of sdf shape priors,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020.
J. Leonard, J. How, S. Teller, M. Berger, S. Campbell, G. Fiore, L. Fletcher, E. Frazzoli, A. Huang, S. Karaman et al., “A perception-driven autonomous urban vehicle,” Journal of Field Robotics, 2008.
H. Gao, B. Cheng, J. Wang, K. Li, J. Zhao, and D. Li, “Object classification using cnn-based fusion of vision and lidar in autonomous vehicle environment,” IEEE Transactions on Industrial Informatics, 2018.
W. Lee, N. Park, and W. Woo, “Depth-assisted real-time 3d object detection for augmented reality,” in Proceedings of the International Conference on Artificial Reality and Telexistence, 2011.
S. Shi, X. Wang, and H. Li, “Pointrcnn: 3d object proposal generation and detection from point cloud,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
A. H. Lang, S. Vora, H. Caesar, L. Zhou, J. Yang, and O. Beijbom, “Pointpillars: Fast encoders for object detection from point clouds,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
X. Ma, S. Liu, Z. Xia, H. Zhang, X. Zeng, and W. Ouyang, “Rethinking pseudo-lidar representation,” in Proceedings of the European Conference on Computer Vision (ECCV), 2020.
S. Song, S. P. Lichtenberg, and J. Xiao, “Sun rgb-d: A rgb-d scene understanding benchmark suite,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
H. Wang, Y. Cong, O. Litany, Y. Gao, and L. J. Guibas, “3dioumatch: Leveraging iou prediction for semi-supervised 3d object detection,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021.
J. Wang, H. Gang, S. Ancha, Y.-T. Chen, and D. Held, “Semi-supervised 3d object detection via temporal graph neural networks,” in International Conference on 3D Vision (3DV), 2021.
J. Yin, J. Fang, D. Zhou, L. Zhang, C.-Z. Xu, J. Shen, and W. Wang, “Semi-supervised 3d object detection with proficient teachers,” in Proceedings of the European Conference on Computer Vision (ECCV), 2022.
Z. Qin, J. Wang, and Y. Lu, “Weakly supervised 3d object detection from point clouds,” in Proceedings of the ACM Conference on Multimedia (MM), 2020.
Q. Meng, W. Wang, T. Zhou, J. Shen, L. Van Gool, and D. Dai, “Weakly supervised 3d object detection from lidar point cloud,” in Proceedings of the European Conference on Computer Vision (ECCV), 2020.
Y. Wei, S. Su, J. Lu, and J. Zhou, “Fgr: Frustum-aware geometric reasoning for weakly supervised 3d vehicle detection,” in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2021.
R. McCraith, E. Insafutdinov, L. Neumann, and A. Vedaldi, “Lifting 2d object locations to 3d by discounting lidar outliers across objects and views,” in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2022.
L. Peng, S. Yan, B. Wu, Z. Yang, X. He, and D. Cai, “Weakm3d: Towards weakly supervised monocular 3d object detection,” in Proceedings of the International Conference on Learning Representations (ICLR), 2022.
H. Wang, S. Sridhar, J. Huang, J. Valentin, S. Song, and L. J. Guibas, “Normalized object coordinate space for category-level 6d object pose and size estimation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
J. J. Park, P. Florence, J. Straub, R. Newcombe, and S. Lovegrove, “Deepsdf: Learning continuous signed distance functions for shape representation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
M. A. Fischler and R. C. Bolles, “Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography,” Communications of the ACM, 1981.
F. Lu, Z. Liu, X. Song, D. Zhou, W. Li, H. Miao, M. Liao, L. Zhang, B. Zhou, R. Yang et al., “Permo: Perceiving more at once from a single image for autonomous driving,” arXiv preprint arXiv:2007.08116, 2020.
S. Wold, K. Esbensen, and P. Geladi, “Principal component analysis,” Chemometrics and Intelligent Laboratory Systems, 1987.
J. Zbontar, L. Jing, I. Misra, Y. LeCun, and S. Deny, “Barlow twins: Self-supervised learning via redundancy reduction,” in Proceedings of the International Conference on Machine Learning (ICML), 2021.
A. Geiger, P. Lenz, and R. Urtasun, “Are we ready for autonomous driving? the kitti vision benchmark suite,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2012.
X. Chen, H. Ma, J. Wan, B. Li, and T. Xia, “Multi-view 3d object detection network for autonomous driving,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga et al., “Pytorch: An imperative style, high-performance deep learning library,” 2019.
C. R. Qi, H. Su, K. Mo, and L. J. Guibas, “Pointnet: Deep learning on point sets for 3d classification and segmentation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980, 2014.
T. Chen, S. Kornblith, M. Norouzi, and G. Hinton, “A simple framework for contrastive learning of visual representations,” in Proceedings of the International Conference on Machine Learning (ICML), 2020.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/88569
dc.description.abstract: 三維物體偵測是三維視覺的一個熱門研究領域,近年來受到廣泛關注。然而,訓練用於三維物體偵測的深度學習模型通常需要大量帶有三維邊界框註釋的數據,這是一項耗時的任務並且存在重大挑戰。為了應對這一挑戰,我們提出了一種通過可變形模板匹配(DTMNet)進行弱監督三維物體偵測的方法,該方法在圖像和二維實例遮罩的弱監督下,通過將可變形形狀模板與輸入的LiDAR點雲進行匹配,生成弱監督的三維虛擬邊界框。生成的三維虛擬邊界框可以用於訓練基於圖像或基於LiDAR的三維物體偵測器。我們的DTMNet顯著降低了註釋成本,提高了三維物體偵測的效率。對KITTI基準數據集的實驗結果在定量和定性上證明了我們提出的模型的有效性和實用性。 (zh_TW)
dc.description.abstract: 3D object detection is an active research topic for 3D vision and has been widely studied in recent years. However, training deep learning models for 3D object detection typically requires extensive data with 3D bounding box annotations, which is a time-consuming task and presents a significant challenge. To address this challenge, we propose a weakly supervised 3D object detection method via deformable template matching (DTMNet), which generates weakly supervised 3D pseudo-bounding boxes by matching a deformable shape template with the input LiDAR point clouds under the weak supervision of images and 2D instance masks. The generated 3D pseudo-bounding boxes can be used to train either image-based or LiDAR-based 3D object detectors. Our DTMNet significantly reduces annotation costs and improves the efficiency of 3D object detection. Experimental results on the KITTI benchmark dataset quantitatively and qualitatively demonstrate the effectiveness and practicality of our proposed model. (en)
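The abstract describes generating 3D pseudo-bounding boxes from LiDAR points under 2D-mask supervision. The toy sketch below is not the thesis's DTMNet (which matches a learned deformable shape template); it only illustrates the general pseudo-box idea with a plain PCA fit standing in for template matching, and the function name `pseudo_box_from_points` is a hypothetical choice for this illustration.

```python
# Toy sketch of deriving a 3D pseudo-bounding box from an object's LiDAR
# points. NOT the thesis's DTMNet: PCA alignment stands in here for the
# deformable template matching described in the abstract.
import numpy as np

def pseudo_box_from_points(points: np.ndarray):
    """points: (N, 3) LiDAR points assumed to belong to one object
    (e.g. selected by projecting a 2D instance mask into the cloud).
    Returns (center, extents, yaw) of a fitted 3D box."""
    center = points.mean(axis=0)
    centered = points - center
    # Estimate the heading in the ground (x, y) plane via PCA:
    # the largest-eigenvalue eigenvector is the object's long axis.
    cov = np.cov(centered[:, :2].T)
    eigvals, eigvecs = np.linalg.eigh(cov)
    heading = eigvecs[:, np.argmax(eigvals)]
    yaw = float(np.arctan2(heading[1], heading[0]))
    # Rotate points into the box frame, then take min/max extents.
    c, s = np.cos(-yaw), np.sin(-yaw)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    local = centered @ rot.T
    extents = local.max(axis=0) - local.min(axis=0)  # length, width, height
    return center, extents, yaw

# Synthetic "car-like" cloud: 4 m x 1.8 m x 1.5 m box, rotated 30 degrees
# about z and translated, mimicking an object crop from a LiDAR sweep.
rng = np.random.default_rng(0)
pts = rng.uniform([-2.0, -0.9, 0.0], [2.0, 0.9, 1.5], size=(500, 3))
a = np.deg2rad(30.0)
r = np.array([[np.cos(a), -np.sin(a), 0], [np.sin(a), np.cos(a), 0], [0, 0, 1]])
pts = pts @ r.T + np.array([10.0, 5.0, 0.0])

center, extents, yaw = pseudo_box_from_points(pts)
print(center.round(2), extents.round(2), round(np.rad2deg(yaw) % 180.0, 1))
```

In the actual method, such a fitted box would serve as a pseudo-label for training an image-based or LiDAR-based 3D detector; a PCA fit like this one cannot recover the amodal extent of partially observed objects, which is one motivation for matching a shape template instead.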
dc.description.provenance: Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2023-08-15T16:52:40Z. No. of bitstreams: 0 (en)
dc.description.provenance: Made available in DSpace on 2023-08-15T16:52:40Z (GMT). No. of bitstreams: 0 (en)
dc.description.tableofcontents:
中文摘要 (Chinese Abstract) i
Abstract ii
List of Figures v
List of Tables vii
1 Introduction 1
2 Related Work 4
2.1 Supervised 3D object detection 4
2.2 Semi-supervised 3D object detection 5
2.3 Weakly supervised 3D object detection 6
3 Proposed Method 8
3.1 Problem formulation and model overview 8
3.2 Weakly supervised deformable template matching 10
3.2.1 Segmentor and Predictor 10
3.2.2 Edge and color supervision 12
3.3 Training and obtaining pseudo-bounding box 15
4 Experiments 16
4.1 Dataset and implementation details 16
4.1.1 Dataset 16
4.1.2 Implementation Details 16
4.2 Weakly supervised 3D object detection 17
4.2.1 Quantitative evaluation 17
4.2.2 Qualitative result 18
4.3 Ablation Study 20
4.4 Additional Experiment Results 22
5 Conclusion 25
Reference 26
dc.language.iso: en
dc.title: 基於可變形模板匹配之弱監督三維物體檢測 (zh_TW)
dc.title: Weakly Supervised 3D Object Detection via Deformable Template Matching (en)
dc.type: Thesis
dc.date.schoolyear: 111-2
dc.description.degree: 碩士 (Master's)
dc.contributor.oralexamcommittee: 孫紹華;陳祝嵩 (zh_TW)
dc.contributor.oralexamcommittee: Shao-Hua Sun;Chu-Song Chen (en)
dc.subject.keyword: 物體偵測,三維視覺,點雲 (zh_TW)
dc.subject.keyword: Object Detection, 3D Vision, Point Cloud (en)
dc.relation.page: 29
dc.identifier.doi: 10.6342/NTU202302077
dc.rights.note: 同意授權 (Authorized; publicly available worldwide)
dc.date.accepted: 2023-08-01
dc.contributor.author-college: 電機資訊學院 (College of Electrical Engineering and Computer Science)
dc.contributor.author-dept: 電信工程學研究所 (Graduate Institute of Communication Engineering)
Appears in Collections: Graduate Institute of Communication Engineering (電信工程學研究所)

Files in This Item:
File | Size | Format
ntu-111-2.pdf | 10.23 MB | Adobe PDF


Except where otherwise noted as to their copyright terms, all items in this repository are protected by copyright, with all rights reserved.

Contact Information
No.1 Sec.4, Roosevelt Rd., Taipei, Taiwan, R.O.C. 106
Tel: (02)33662353
Email: ntuetds@ntu.edu.tw
© NTU Library All Rights Reserved