
DSpace

The DSpace institutional repository system is dedicated to preserving digital materials of all kinds (e.g., text, images, PDF) and making them easy to access.

  1. NTU Theses and Dissertations Repository
  2. College of Electrical Engineering and Computer Science
  3. Department of Electrical Engineering
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/68905
Full metadata record
DC field: value [language]
dc.contributor.advisor: 于天立 (Tian-Li Yu)
dc.contributor.author: Zi-Wen Guo [en]
dc.contributor.author: 郭子文 [zh_TW]
dc.date.accessioned: 2021-06-17T02:41:21Z
dc.date.available: 2020-08-24
dc.date.copyright: 2020-08-24
dc.date.issued: 2020
dc.date.submitted: 2020-08-18
dc.identifier.citation:
[1] A. Collet, D. Berenson, S. S. Srinivasa, and D. Ferguson. Object recognition and full pose registration from a single image for robotic manipulation. In 2009 IEEE International Conference on Robotics and Automation, pages 48–55, 2009.
[2] S. Lee, L. Wei, and A. M. Naguib. Adaptive bayesian recognition and pose estimation of 3D industrial objects with optimal feature selection. In 2016 IEEE International Symposium on Assembly and Manufacturing, pages 50–55, 2016.
[3] A. Mousavian, D. Anguelov, J. Flynn, and J. Košecká. 3D bounding box estimation using deep learning and geometry. In 2017 IEEE Conference on Computer Vision and Pattern Recognition, pages 5632–5640, 2017.
[4] J. J. Lim, H. Pirsiavash, and A. Torralba. Parsing ikea objects: Fine pose estimation. In 2013 IEEE International Conference on Computer Vision, pages 2992–2999, 2013.
[5] A. Shahrokni, L. Vacchetti, V. Lepetit, and P. Fua. Polyhedral object detection and pose estimation for augmented reality applications. In 2002 Proceedings of Computer Animation 2002, pages 65–69, 2002.
[6] L. He, S. Wang, and T. N. Pappas. 3D surface registration using Z-SIFT. In 2011 18th IEEE International Conference on Image Processing, pages 1985–1988, 2011.
[7] K. Yamada and A. Kimura. A performance evaluation of keypoints detection methods sift for 3D reconstruction. In 2018 International Workshop on Advanced Image Technology, pages 1–4, 2018.
[8] Z. Fang and S. Scherer. Real-time onboard 6DoF localization of an indoor MAV in degraded visual environments using an RGB-D camera. In 2015 IEEE International Conference on Robotics and Automation (ICRA), pages 5253–5259, 2015.
[9] Yoshifumi Kuriyama, Yasushi Yasuda, Ken’ichi Yano, and Masafumi Hamaguchi. Liquid container transfer control by hybrid shape approach for 6dof manipulator. In SICE Annual Conference 2007, pages 3069–3072, 2007.
[10] M. Stark, M. Goesele, and B. Schiele. Back to the future: Learning shape models from 3D CAD data. pages 1–11, 01 2010.
[11] B. Drost, M. Ulrich, N. Navab, and S. Ilic. Model globally, match locally: Efficient and robust 3D object recognition. In 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 998–1005, 2010.
[12] Y. Xiang, T. Schmidt, V. Narayanan, and D. Fox. PoseCNN: A convolutional neural network for 6D object pose estimation in cluttered scenes. ArXiv, 1711.00199, 2018.
[13] S. Hinterstoisser, V. Lepetit, S. Ilic, S. Holzer, G. Bradski, K. Konolige, and N. Navab. Model based training, detection and pose estimation of texture-less 3D objects in heavily cluttered scenes. In Asian Conference on Computer Vision, pages 548–562, 2012.
[14] Q. Dang, J. Yin, B. Wang, and W. Zheng. Deep learning based 2D human pose estimation: A survey. volume 24, pages 663–676, 2019.
[15] Zhe Cao, Gines Hidalgo, Tomas Simon, Shih-En Wei, and Yaser Sheikh. Openpose: Realtime multi-person 2d pose estimation using part affinity fields. CoRR, abs/1812.08008, 2018.
[16] Joseph Redmon, Santosh Kumar Divvala, Ross B. Girshick, and Ali Farhadi. You only look once: Unified, real-time object detection. CoRR, abs/1506.02640, 2015.
[17] R. B. Rusu, N. Blodow, and M. Beetz. Fast point feature histograms (fpfh) for 3d registration. In 2009 IEEE International Conference on Robotics and Automation, pages 3212–3217, 2009.
[18] R. B. Rusu, G. Bradski, R. Thibaux, and J. Hsu. Fast 3d recognition and pose using the viewpoint feature histogram. In 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 2155–2162, 2010.
[19] Xian-Feng Han, Jesse S. Jin, Juan Xie, Ming-Jie Wang, and Wei Jiang. A comprehensive review of 3d point cloud descriptors. CoRR, abs/1802.02297, 2018.
[20] S. Hinterstoisser, V. Lepetit, N. Rajkumar, and K. Konolige. Going further with point pair features. Lecture Notes in Computer Science, pages 834–848, 2016.
[21] C. Papazov and D. Burschka. An efficient ransac for 3D object recognition in noisy and occluded scenes. pages 135–148, 01 2010.
[22] C. Huang, H. Ai, Y. Li, and S. Lao. Vector boosting for rotation invariant multiview face detection. In Tenth IEEE International Conference on Computer Vision, volume 1, pages 446–453 Vol. 1, 2005.
[23] S. Hinterstoisser, S. Holzer, C. Cagniart, S. Ilic, K. Konolige, N. Navab, and V. Lepetit. Multimodal templates for real-time detection of texture-less objects in heavily cluttered scenes. In 2011 International Conference on Computer Vision, pages 858–865, 2011.
[24] C. Wang, D. Xu, Y. Zhu, R. Martín-Martín, C. Lu, L. Fei-Fei, and S. Savarese. DenseFusion: 6D object pose estimation by iterative dense fusion. In 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 3338–3347, 2019.
[25] Bugra Tekin, Sudipta N. Sinha, and Pascal Fua. Real-time seamless single shot 6d object pose prediction. CoRR, abs/1711.08848, 2017.
[26] Mahdi Rad and Vincent Lepetit. BB8: A scalable, accurate, robust to partial occlusion method for predicting the 3d poses of challenging objects without using depth. CoRR, abs/1703.10896, 2017.
[27] S. Peng, Y. Liu, Q. Huang, X. Zhou, and H. Bao. PVNet: Pixel-wise voting network for 6DoF pose estimation. In 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4556–4565, 2019.
[28] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-net: Convolutional networks for biomedical image segmentation. CoRR, abs/1505.04597, 2015.
[29] M. Spivak. A Comprehensive Introduction to Differential Geometry, volume 5, 1999.
[30] D. Comaniciu and P. Meer. Mean shift: A robust approach toward feature space analysis. volume 24, pages 603–619, 2002.
[31] M. Schoeler, J. Papon, and F. Wörgötter. Constrained planar cuts - object partitioning for point clouds. In 2015 IEEE Conference on Computer Vision and Pattern Recognition, pages 5207–5215, 2015.
[32] B. Calli, A. Singh, J. Bruce, A. Walsman, K. Konolige, P. Abbeel, S. Srinivasa, and A. M. Dollar. Yale-CMU-Berkeley dataset for robotic manipulation research. volume 36, pages 261–268, 2017.
[33] A. V. Patil and P. Rabha. A survey on joint object detection and pose estimation using monocular vision. ArXiv, 1811.10216, 2018.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/68905
dc.description.abstract: Object pose estimation is widely used in 3D-related tasks in industry and virtual technology, and has long been an important research topic at many institutions; applications include robotic-arm grasping, assembly of precision-instrument parts, and human-scene interaction in virtual and augmented reality. The relative importance of the challenges faced, such as occlusion, lighting changes, and speed, varies with the operating environment. Traditionally, because depth cameras were expensive and hard to obtain, most research focused on color images, with classic methods such as template matching.
With the rapid development of depth-camera technology in recent years, the cost of depth cameras has dropped significantly, and more and more applications have begun to incorporate depth information, moving machines from the 2D color world into the 3D world. This thesis proposes a six-degree-of-freedom object pose estimation system whose main idea comes from bottom-up human pose estimation. The system first partitions a 3D object automatically into several key components based on the magnitude of its curvature, then uses a mutual voting mechanism among the points within each key component to find the most central point of that component, and finally applies the RANSAC algorithm to estimate the pose. In addition, to locate the key components more quickly, this thesis also adopts 2D recognition techniques for acceleration; this not only finds the key components faster but also takes the color information of regions with large curvature variation into account. Experiments under different conditions, such as occlusion, lighting changes, execution time, and multiple scales, are reported in this thesis.
zh_TW
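The partitioning step described in the abstract starts from a per-point curvature measure on the object's point cloud. As an illustrative sketch only (not the thesis code), a common point-cloud stand-in for curvature, the "surface variation" obtained from local PCA, can be computed as follows; the function name, the brute-force neighbour search, and the choice of `k` are all assumptions:

```python
# Illustrative sketch, not the thesis implementation: a per-point curvature
# proxy ("surface variation") from the eigenvalues of the local covariance.
# High values mark candidate key-component regions; flat areas score near 0.
import numpy as np

def surface_variation(points, k=10):
    """lambda_min / (lambda_1 + lambda_2 + lambda_3) of the covariance of the
    k nearest neighbours of each point (brute force, fine for small clouds)."""
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
    scores = np.empty(len(points))
    for i, row in enumerate(d2):
        nbrs = points[np.argsort(row)[:k]]          # k nearest, incl. the point
        lam = np.linalg.eigvalsh(np.cov(nbrs.T))    # eigenvalues, ascending
        scores[i] = lam[0] / lam.sum()
    return scores
```

Thresholding such a score and grouping the surviving points would yield curvature-based regions of interest; the thesis itself uses Gaussian curvature, which additionally requires estimating principal curvatures.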
dc.description.abstract: Object pose estimation is widely used in industry and in 3D virtual technology, and has long been a critical research topic at many institutions; applications include robotic grasping, precision parts assembly, and virtual reality. Which challenges dominate, such as occlusion, lighting changes, or speed, differs across deployment environments. Traditionally, because depth cameras were expensive, most studies focused on color images.
In recent years, more and more machines have begun to incorporate depth information. We propose a 6DoF object pose estimation method that combines depth information with color images. The idea comes from bottom-up human pose estimation. The method automatically divides an object into several key components: Gaussian curvature is used to find regions of interest and group them; key regions are then found through a voting mechanism within each component, and the center point of each region is localized. In addition, to meet speed requirements, a real-time 2D detection technique is applied to accelerate the method. Finally, RANSAC is used to estimate the pose of the object. Experiments on occlusion, lighting changes, execution time, and multiple scales are presented in this thesis.
en
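The final step named in the abstract, RANSAC on top of the localized key-component centers, can be sketched as follows. This is a minimal illustration under assumed correspondences, thresholds, and a Kabsch least-squares solver; it is not the thesis implementation:

```python
# Minimal sketch (assumptions throughout): estimate a 6-DoF pose by RANSAC over
# rigid alignments fit to random minimal samples of matched 3D points.
import numpy as np

def rigid_transform(src, dst):
    """Least-squares R, t with dst_i ~= R @ src_i + t (Kabsch algorithm)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))        # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t

def ransac_pose(model_pts, scene_pts, iters=200, thresh=0.01, seed=0):
    """RANSAC over minimal 3-point samples; returns (R, t, inlier_count)."""
    rng = np.random.default_rng(seed)
    best = (None, None, -1)
    n = len(model_pts)
    for _ in range(iters):
        idx = rng.choice(n, size=3, replace=False)
        R, t = rigid_transform(model_pts[idx], scene_pts[idx])
        err = np.linalg.norm(model_pts @ R.T + t - scene_pts, axis=1)
        inliers = int((err < thresh).sum())
        if inliers > best[2]:
            best = (R, t, inliers)
    return best
```

In the system described above, `model_pts` would be the key-component centers on the object model and `scene_pts` their counterparts localized by voting in the observed cloud.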
dc.description.provenance: Made available in DSpace on 2021-06-17T02:41:21Z (GMT). No. of bitstreams: 1
U0001-1708202002494300.pdf: 16220379 bytes, checksum: 3d4c21f4df2d556fbe35b7068db761b6 (MD5)
Previous issue date: 2020
en
dc.description.tableofcontents:
Acknowledgements i
摘要 (Abstract in Chinese) ii
Abstract iv
1 Introduction 2
2 Background 6
2.1 Problem Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.2 6-DoF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.3 Bottom-up Human Pose Estimation Approach . . . . . . . . . . . . . 8
2.4 YOLO Detector . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.5 Camera Calibration . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3 Related Work 13
3.1 Descriptor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
3.2 Template-based Method . . . . . . . . . . . . . . . . . . . . . . . . . 14
3.3 End-to-End . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
3.3.1 DenseFusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.3.2 PVNet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
4 Proposed Method 20
4.1 System Overview - Point Cloud Version . . . . . . . . . . . . . . . . . 20
4.2 Curvature-based Key-component Localization . . . . . . . . . . . . . 20
4.2.1 Region of Interest . . . . . . . . . . . . . . . . . . . . . . . 22
4.2.2 Voting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
4.2.3 Localization . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
4.3 Projection From 2D to 3D Space . . . . . . . . . . . . . . . . . . . . 27
4.4 System Overview - YOLO Version . . . . . . . . . . . . . . . . . . . . 28
4.5 Multi-view . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
4.5.1 Corresponding Mapping . . . . . . . . . . . . . . . . . . . . . 30
5 Experiment 33
5.1 Metric Criteria - Average Distance of Model Points . . . . . . . . . . 33
5.2 Data Preparation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
5.3 The Result of Point Cloud Version on The Common Object . . . . . 34
5.4 The Result of YOLO Version on The Common Object Dataset . . . . 36
5.5 The Result on The Occlusion Drill . . . . . . . . . . . . . . . . . . . 37
5.6 The Result on The Occlusion Drill with Shadow . . . . . . . . . . . . 39
5.7 The Result on The Multi-scale . . . . . . . . . . . . . . . . . . . . . . 40
5.8 The Result on The Different Object . . . . . . . . . . . . . . . . . . . 42
5.9 The Result under The Complex Background . . . . . . . . . . . . . . 44
6 Conclusion 49
Bibliography 50
dc.language.iso: en
dc.subject: 物件姿態估計 (object pose estimation) [zh_TW]
dc.subject: 高斯曲率 (Gaussian curvature) [zh_TW]
dc.subject: 由下而上方法 (bottom-up method) [zh_TW]
dc.subject: 三維空間 (three-dimensional space) [zh_TW]
dc.subject: Object pose estimation [en]
dc.subject: Three-dimensional space [en]
dc.subject: Bottom-up method [en]
dc.subject: Gaussian curvature [en]
dc.title: 基於曲率變化之關鍵零件定位物件姿態估計 [zh_TW]
dc.title: 6-DoF Object Pose Estimation via Curvature-based Key-component Localization [en]
dc.type: Thesis
dc.date.schoolyear: 108-2
dc.description.degree: 碩士 (Master)
dc.contributor.oralexamcommittee: 雷欽隆 (Chin-Laung Lei), 李宏毅 (Hung-Yi Lee), 吳育任 (Yuh-Renn Wu)
dc.subject.keyword: 物件姿態估計, 高斯曲率, 三維空間, 由下而上方法 [zh_TW]
dc.subject.keyword: Object pose estimation, Gaussian curvature, Three-dimensional space, Bottom-up method [en]
dc.relation.page: 54
dc.identifier.doi: 10.6342/NTU202003661
dc.rights.note: 有償授權 (paid authorization)
dc.date.accepted: 2020-08-19
dc.contributor.author-college: 電機資訊學院 (College of Electrical Engineering and Computer Science) [zh_TW]
dc.contributor.author-dept: 電機工程學研究所 (Graduate Institute of Electrical Engineering) [zh_TW]
Appears in collections: Department of Electrical Engineering

Files in this item:
File    Size    Format
U0001-1708202002494300.pdf (restricted; not authorized for public access)    15.84 MB    Adobe PDF


Except where otherwise noted, all items in the system are protected by copyright, with all rights reserved.

Contact information
No. 1, Sec. 4, Roosevelt Rd., Taipei 10617, Taiwan (R.O.C.)
Tel: (02)33662353
Email: ntuetds@ntu.edu.tw
© NTU Library All Rights Reserved