Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/49860
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 連豊力 | |
dc.contributor.author | Yu-Ting Chen | en |
dc.contributor.author | 陳昱廷 | zh_TW |
dc.date.accessioned | 2021-06-15T11:53:08Z | - |
dc.date.available | 2016-08-24 | |
dc.date.copyright | 2016-08-24 | |
dc.date.issued | 2016 | |
dc.date.submitted | 2016-08-11 | |
dc.identifier.citation | [1: Engel et al. 2014] J. Engel, T. Schöps, and D. Cremers, “LSD-SLAM: Large-Scale Direct Monocular SLAM,” in Proceedings of the European Conference on Computer Vision (ECCV), Zurich, Switzerland, pp. 834-849, Sep. 6-12, 2014.
[2: Engel et al. 2013] J. Engel, J. Sturm, and D. Cremers, “Semi-Dense Visual Odometry for a Monocular Camera,” in Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2013.
[3: Mur-Artal & Tardos 2015] R. Mur-Artal and J. D. Tardos, “Probabilistic Semi-Dense Mapping from Highly Accurate Feature-Based Monocular SLAM,” in Proceedings of Robotics: Science and Systems (RSS), Rome, Italy, July 2015.
[4: Mur-Artal et al. 2015] R. Mur-Artal, J. M. M. Montiel, and J. D. Tardos, “ORB-SLAM: A Versatile and Accurate Monocular SLAM System,” IEEE Transactions on Robotics, vol. 31, no. 5, pp. 1147-1163, 2015.
[5: Forster et al. 2014] C. Forster, M. Pizzoli, and D. Scaramuzza, “SVO: Fast Semi-Direct Monocular Visual Odometry,” in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Hong Kong, China, pp. 15-22, May 31-June 7, 2014.
[6: Pizzoli et al. 2014] M. Pizzoli, C. Forster, and D. Scaramuzza, “REMODE: Probabilistic, Monocular Dense Reconstruction in Real Time,” in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Hong Kong, China, pp. 2609-2616, May 31-June 7, 2014.
[7: Vogiatzis & Hernández 2011] G. Vogiatzis and C. Hernández, “Video-based, real-time multi-view stereo,” Image and Vision Computing, vol. 29, no. 7, pp. 434-441, 2011.
[8: Woodford & Vogiatzis 2012] O. Woodford and G. Vogiatzis, “A generative model for online depth fusion,” in Proceedings of the European Conference on Computer Vision (ECCV), Florence, Italy, pp. 144-157, Oct. 7-13, 2012.
[9: Chambolle & Pock 2011] A. Chambolle and T. Pock, “A first-order primal-dual algorithm for convex problems with applications to imaging,” Journal of Mathematical Imaging and Vision, vol. 40, no. 1, pp. 120-145, 2011.
[10: Concha & Civera 2015] A. Concha and J. Civera, “DPPTAM: Dense piecewise-planar tracking and mapping from a monocular sequence,” in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Hamburg, Germany, pp. 5686-5693, Sept. 28-Oct. 2, 2015.
[11: Concha & Civera 2014] A. Concha and J. Civera, “Using superpixels in monocular SLAM,” in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Hong Kong, China, pp. 365-372, May 31-June 7, 2014.
[12: Felzenszwalb & Huttenlocher 2004] P. F. Felzenszwalb and D. P. Huttenlocher, “Efficient graph-based image segmentation,” International Journal of Computer Vision, vol. 59, no. 2, pp. 167-181, Sept. 2004.
[13: Concha et al. 2014] A. Concha, W. Hussain, L. Montano, and J. Civera, “Manhattan and piecewise-planar constraints for dense monocular mapping,” in Proceedings of Robotics: Science and Systems (RSS), Berkeley, CA, USA, July 12-16, 2014.
[14: Kerl et al. 2013] C. Kerl, J. Sturm, and D. Cremers, “Robust odometry estimation for RGB-D cameras,” in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Karlsruhe, Germany, pp. 3748-3754, May 6-10, 2013.
[15: Stuehmer et al. 2010] J. Stuehmer, S. Gumhold, and D. Cremers, “Real-time dense geometry from a handheld camera,” in Proceedings of the DAGM Symposium on Pattern Recognition, 2010.
[16: Nister et al. 2004] D. Nister, O. Naroditsky, and J. Bergen, “Visual Odometry,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), Washington, DC, USA, vol. 1, pp. 652-659, June 27-July 2, 2004.
[17: Scaramuzza & Fraundorfer 2011] D. Scaramuzza and F. Fraundorfer, “Visual Odometry [Tutorial],” IEEE Robotics and Automation Magazine, vol. 18, no. 4, pp. 80-92, Dec. 2011.
[18: Newcombe et al. 2011] R. A. Newcombe, S. J. Lovegrove, and A. J. Davison, “DTAM: Dense tracking and mapping in real-time,” in Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2011.
[19: Baker & Matthews 2004] S. Baker and I. Matthews, “Lucas-Kanade 20 years on: A unifying framework: Part 1,” International Journal of Computer Vision (IJCV), vol. 56, no. 3, pp. 221-255, 2004.
[20: Klein & Murray 2007] G. Klein and D. W. Murray, “Parallel tracking and mapping for small AR workspaces,” in Proceedings of the International Symposium on Mixed and Augmented Reality (ISMAR), 2007.
[21: Strasdat et al. 2010] H. Strasdat, J. M. M. Montiel, and A. Davison, “Scale drift-aware large scale monocular SLAM,” in Proceedings of Robotics: Science and Systems (RSS), Zaragoza, Spain, June 27-30, 2010.
[22: Bailey & Durrant-Whyte 2006] T. Bailey and H. Durrant-Whyte, “Simultaneous Localization and Mapping (SLAM): Part II State of the Art,” IEEE Robotics & Automation Magazine, vol. 13, no. 3, pp. 108-117, Sept. 2006.
[23: Zhang 2000] Z. Zhang, “A flexible new technique for camera calibration,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 11, pp. 1330-1334, Nov. 2000.
[24: Lepetit et al. 2009] V. Lepetit, F. Moreno-Noguer, and P. Fua, “EPnP: An Accurate O(n) Solution to the PnP Problem,” International Journal of Computer Vision, vol. 81, pp. 155-166, 2009.
[25: Blanco 2010] J. L. Blanco, “A tutorial on SE(3) transformation parameterizations and on-manifold optimization,” Technical Report, University of Malaga, 2010.
[26: Zhang 1997] Z. Zhang, “Parameter Estimation Techniques: A Tutorial with Application to Conic Fitting,” Image and Vision Computing, vol. 15, no. 1, pp. 59-76, Jan. 1997.
[27: Rublee et al. 2011] E. Rublee, V. Rabaud, K. Konolige, and G. Bradski, “ORB: An efficient alternative to SIFT or SURF,” in Proceedings of the IEEE International Conference on Computer Vision (ICCV), Barcelona, Spain, pp. 2564-2571, Nov. 6-13, 2011.
[28: Fischler & Bolles 1981] M. A. Fischler and R. C. Bolles, “Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography,” Communications of the ACM, vol. 24, no. 6, pp. 381-395, 1981.
[29: Lim et al. 2014] H. Lim, J. Lim, and H. J. Kim, “Real-time 6-DOF monocular visual SLAM in a large-scale environment,” in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Hong Kong, China, pp. 1532-1539, May 31-June 7, 2014.
[30: Whelan et al. 2015] T. Whelan, S. Leutenegger, R. S. Moreno, B. Glocker, and A. Davison, “ElasticFusion: Dense SLAM without a pose graph,” in Proceedings of Robotics: Science and Systems (RSS), Rome, Italy, pp. 1-10, July 13-17, 2015.
[31: Stuckler & Behnke 2014] J. Stuckler and S. Behnke, “Multi-resolution surfel maps for efficient dense 3D modeling and tracking,” Journal of Visual Communication and Image Representation, vol. 25, no. 1, pp. 137-147, 2014.
[32: Whelan et al. 2015] T. Whelan, M. Kaess, H. Johannsson, M. Fallon, J. J. Leonard, and J. McDonald, “Real-time large scale dense RGB-D SLAM with volumetric fusion,” International Journal of Robotics Research, vol. 34, no. 4-5, pp. 598-626, April 2015.
[33: Engel et al. 2015] J. Engel, J. Stueckler, and D. Cremers, “Large-scale direct SLAM with stereo cameras,” in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Hamburg, Germany, pp. 1935-1942, Sept. 28-Oct. 2, 2015.
[34: Kuschk & Cremers 2013] G. Kuschk and D. Cremers, “Fast and accurate large-scale stereo reconstruction using variational methods,” in Proceedings of the IEEE International Conference on Computer Vision Workshops (ICCVW), Sydney, NSW, Australia, pp. 700-707, Dec. 2-8, 2013.
[35: Geiger et al. 2011] A. Geiger, J. Ziegler, and C. Stiller, “StereoScan: Dense 3D reconstruction in real-time,” in Proceedings of the IEEE Intelligent Vehicles Symposium (IV), Baden-Baden, Germany, pp. 963-968, June 5-9, 2011.
[36: Bentley 1975] J. L. Bentley, “Multidimensional binary search trees used for associative searching,” Communications of the ACM, vol. 18, no. 9, pp. 509-517, Sept. 1975.
[37: Lowe 2004] D. G. Lowe, “Distinctive Image Features from Scale-Invariant Keypoints,” International Journal of Computer Vision, vol. 60, no. 2, pp. 91-110, 2004.
[38: Rosten & Drummond 2006] E. Rosten and T. Drummond, “Machine learning for high-speed corner detection,” in Proceedings of the European Conference on Computer Vision (ECCV), vol. 1, pp. 430-443, 2006.
[39: Civera et al. 2008] J. Civera, A. J. Davison, and J. M. M. Montiel, “Inverse Depth Parametrization for Monocular SLAM,” IEEE Transactions on Robotics, vol. 24, no. 5, pp. 932-945, Oct. 2008.
[40: Kuemmerle et al. 2011] R. Kuemmerle, G. Grisetti, H. Strasdat, K. Konolige, and W. Burgard, “g2o: A general framework for graph optimization,” in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Shanghai, China, pp. 3607-3613, May 2011.
[41: Weinzaepfel et al. 2013] P. Weinzaepfel, J. Revaud, Z. Harchaoui, and C. Schmid, “DeepFlow: Large Displacement Optical Flow with Deep Matching,” in Proceedings of the IEEE International Conference on Computer Vision (ICCV), Sydney, NSW, Australia, pp. 1385-1392, Dec. 1-8, 2013.
Websites:
[42: Point Cloud Library from PCL Website 2016] Point Cloud Library. (2016, July 8). In PCL Website. Retrieved July 8, 2016, from http://pointclouds.org/
[43: OpenCV from OpenCV official website 2016] Open Source Computer Vision Library. (2016, July 8). In OpenCV official Website. Retrieved July 8, 2016, from http://opencv.org/
[44: Eade 2013] E. Eade, “Lie Groups for 2D and 3D Transformations,” 2013, PDF file available at http://ethaneade.com
Books:
[45: Szeliski 2011] R. Szeliski, Computer Vision: Algorithms and Applications, Springer-Verlag, London, 2011.
[46: Hartley & Zisserman 2004] R. I. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision, Cambridge University Press, second edition, 2004. | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/49860 | - |
dc.description.abstract | 近幾年來,利用單眼攝影機進行三維環境的重建已成為一個熱門且具有挑戰性的題目。這項技術可以應用在無人載具進行自動導航、環境探勘以及自動避障,除此之外,還可以應用在擴增實境上。由於相機並不配有慣性測量單元,定位相機與三維環境的重建必須同時進行。在此篇論文中,相機的姿態估測使用以影像上特徵點為基礎的方式 [24: Lepetit et al. 2009] 以及直接利用影像上像素的方式 [1: Engel et al. 2014]。相機姿態估測需要使用到重建出的半稠密深度圖,而這些點都是影像上梯度較高的點且很容易就變得很雜亂。因此,本篇論文提出了一個可正規化重建出的半稠密地圖而不影響相機估測精確度的方法,該正規化方法可以去除雜訊並且平滑重建出的地圖。除此之外,此正規化方法與兩張影像上的像素亮度值有關,不同於其他方法只基於重建點之深度值以及空間上的關係。地圖的重建可以分成三個部分:立體匹配法、片狀平面約束以及平面最佳化。由於高梯度的影像部分通常都很狹窄,不容易通過片狀平面的約束,因此提出了一個利用重建高梯度區域周圍低梯度點來增寬高梯度區域的匹配方式。當半稠密的地圖重建完後,片狀平面約束會估測半稠密地圖上片狀的平面,最後再最佳化所有半稠密地圖上的片狀平面。
在本篇論文中,提出的立體匹配法由 ORB [27: Rublee et al. 2011] 特徵點的預設深度資訊、K維樹 [36: Bentley 1975]、優先序列以及方向梯度直方圖的亂度構成,目標是透過極線幾何來正確匹配在兩張影像中圍繞在高梯度點附近的低梯度點。由於低梯度點並沒有紋理上的特徵,很難在兩張影像中做匹配,所以會搜尋周圍最適合並且有紋理的區域來進行匹配。首先,若這個像素並沒有逆深度的假設模型,在此像素點周圍具有深度資訊的 ORB 特徵點會用來初始化該像素點的深度值,如此可以縮短極線上的搜尋長度以及增加匹配的精準度。搜尋高梯度紋理區域是利用K維樹進行K最近鄰演算法,並將搜尋到的點依照梯度最大優先的條件存入優先序列中。若搜尋到的點通過立體匹配約束,則這個高梯度點會形成一個 5×5 像素大小的模板,然後用來進行立體匹配。當兩張影像上模板間的差值通過隨方向梯度直方圖亂度而改變的閥值時,相對應的點便視為匹配成功。本篇論文正規化的部分,假設投影到三維空間中的每一小片狀點雲都符合一個平面,每個片狀點雲的大小為 5×5 像素。由於這個假設在兩個物體的交接處或不連續的區域並不會成立,此時片狀平面約束便用來篩選掉非平面的區域。通過片狀平面約束後,利用高斯牛頓法最小化兩張影像上投影自片狀平面區塊的差值以得到最佳化的平面參數。之後,最佳化平面參數用來消除點雲中的雜點以及讓點雲平滑化。實驗結果顯示所提出的正規化演算法可以消除大部分的雜點,以及重建出一個更加清楚的環境。 | zh_TW |
dc.description.abstract | Three-dimensional environment reconstruction from a monocular camera has been a popular and challenging research topic in the past few years. This technique can be applied to unmanned vehicles for automatic navigation, environment exploration, and automatic obstacle avoidance; it can also be applied to augmented reality. Since the camera is not equipped with an inertial measurement unit (IMU), the camera must be localized and the environment mapped simultaneously. In this thesis, camera pose estimation is based on a feature-based method [24: Lepetit et al. 2009] and a direct method [1: Engel et al. 2014]. The camera localization thread depends on the reconstructed semi-dense map, which covers the high-gradient areas of the image and easily becomes noisy. Hence, this thesis proposes a method that regularizes the reconstructed semi-dense map without affecting the accuracy of camera pose localization. The regularization method eliminates noise and smooths the semi-dense map. Furthermore, it uses the photometric information between two images, unlike other methods that rely only on depth values and spatial relations. The reconstruction algorithm can be divided into three parts: stereo matching, a piecewise planar constraint, and plane optimization. Since high-gradient areas are usually narrow and hard to fit with the piecewise planar constraint, a stereo matching method is proposed that broadens each high-gradient area by reconstructing its nearby low-gradient pixels. After the semi-dense map is reconstructed, it is propagated to the piecewise planar constraint, which estimates an initial piecewise plane for each pixel. Finally, the optimization method refines each estimated piecewise plane.
In this thesis, the proposed stereo matching combines the prior depth of ORB features [27: Rublee et al. 2011], a KD-tree [36: Bentley 1975], a priority queue, and the entropy of the histogram of oriented gradients. The aim is to correctly match, via epipolar geometry, the low-gradient areas surrounding high-gradient areas between two images. Since two textureless areas are hard to match directly, the best nearby textured area is searched for and used in the matching procedure. First, if a pixel does not yet hold an inverse-depth hypothesis, nearby ORB features with known depth are used to initialize its inverse depth, which shortens the epipolar search length and improves matching accuracy. Textured areas containing high-gradient pixels are found by a k-nearest-neighbor search on a KD-tree, and the retrieved pixels are sorted by gradient magnitude in a priority queue. If a retrieved point passes the stereo searching constraint, it forms a 5×5-pixel template that is used for the epipolar line search. Two corresponding points are considered matched if the residual between the templates in the two images passes a stereo matching threshold that varies with the entropy of the histogram of oriented gradients of the search region. In the regularization part of this thesis, each small piece of the point cloud projected from the image into 3D coordinates is assumed to fit a plane; the corresponding image size of each piece is 5×5 pixels. Since this assumption does not hold on the border between two objects or in discontinuous areas, the planar constraint is applied to reject non-planar regions.
After a piece passes the planar constraint, the Gauss-Newton method is used to minimize the photometric error between the two image patches projected from the 3D piece, yielding the optimal plane parameters. Afterwards, the optimal parameters are used to eliminate noise and smooth the point cloud. The experimental results demonstrate that the proposed regularization algorithm eliminates most of the noise and reconstructs a clearer environment. | en |
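The search-and-order step of the abstract's stereo matching (find nearby textured pixels, then try the strongest-gradient template first) can be sketched roughly as follows. This is an illustrative toy, not the thesis code: the brute-force neighbor search stands in for the KD-tree, and the function names and sample data are invented for the example.

```python
import heapq
import math

def nearest_textured(pixel, candidates, k=3):
    """Return the k candidates nearest to `pixel`.

    Brute-force stand-in for the KD-tree k-nearest-neighbor search
    described in the abstract; `candidates` are (x, y, gradient)
    tuples for high-gradient (textured) pixels.
    """
    return sorted(candidates, key=lambda c: math.dist(pixel, c[:2]))[:k]

def matching_order(pixel, candidates, k=3):
    """Order the k nearest textured pixels by descending gradient.

    A max-heap (heapq with negated keys) plays the role of the
    priority queue: the strongest-gradient 5x5 template would be
    tried first in the epipolar line search.
    """
    heap = [(-g, (x, y)) for x, y, g in nearest_textured(pixel, candidates, k)]
    heapq.heapify(heap)
    order = []
    while heap:
        neg_g, pt = heapq.heappop(heap)
        order.append((pt, -neg_g))
    return order

# Toy data: four high-gradient pixels as (x, y, gradient magnitude).
cands = [(1, 1, 10.0), (2, 2, 50.0), (9, 9, 80.0), (1, 2, 30.0)]
# (9, 9) is textured but too far away, so it is excluded by the k-NN
# step; the remaining three are visited strongest-gradient first.
print(matching_order((0, 0), cands))
```

In the actual system, each visited point would then undergo the template comparison against the entropy-dependent threshold before being accepted as a match.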
dc.description.provenance | Made available in DSpace on 2021-06-15T11:53:08Z (GMT). No. of bitstreams: 1 ntu-105-R03921066-1.pdf: 14295722 bytes, checksum: a4325bb5e166c761841178290ca22499 (MD5) Previous issue date: 2016 | en |
dc.description.tableofcontents | 口試委員會審定書 (Committee Certification) i
誌謝 (Acknowledgements) iii
摘要 (Chinese Abstract) v
ABSTRACT vii
CONTENTS xi
LIST OF FIGURES xiii
LIST OF TABLES xix
Chapter 1 Introduction 1
1.1 Motivation 1
1.2 Problem Formulation 3
1.3 Contributions 5
1.4 Organization of the Thesis 6
Chapter 2 Background and Literature Survey 8
2.1 Camera Pose Estimation 8
2.2 Map Reconstruction 10
Chapter 3 Related Algorithms 14
3.1 Camera Pin-hole Model 14
3.2 Lie Group SE(3), Rigid transformations 18
3.3 Epipolar Geometry 23
3.4 Nonlinear least squares and Gauss Newton Method 30
3.5 Random Sample Consensus 36
Chapter 4 Regularized Semi-dense Map Reconstruction and Camera Pose Estimation 39
4.1 Initialization of the System 41
4.1.1 Feature extraction and Feature matching 41
4.1.2 Estimation of Fundamental matrix 43
4.1.3 Camera Pose Reconstruction 46
4.2 Epipolar Line Searching 51
4.2.1 Epipolar Line Searching Range 51
4.2.2 Matching Method 54
4.2.3 Inverse Depth hypothesis [2: Engel et al. 2013] 68
4.3 Camera Pose Estimation 76
4.3.1 EPnP [24: Lepetit et al. 2009] 76
4.3.2 Direct Method on SE(3) [1: Engel et al. 2014] 77
4.3.3 Direct Method on Sim(3) [1: Engel et al. 2014] 86
4.4 Depth Map Regularization 94
4.4.1 Piecewise Planar Constraint and Plane Calculation 95
4.4.2 Plane Optimization 101
4.4.3 Regularization Algorithm 105
4.5 Keyframe Creation 113
Chapter 5 Experimental Results and Analysis 116
5.1 Hardware and Camera intrinsic parameter 117
5.2 Simulation of camera pose estimation on SE(3) 118
5.2.1 Simulation Data Set 118
5.2.2 Accuracy of Camera Pose Estimation 120
5.3 Simulation of keyframe creation 136
5.3.1 Simulation Data Set 136
5.3.2 Simulation of new keyframe creation 138
5.4 Simulation of camera pose estimation using Direct method on Sim(3) 148
5.4.1 Simulation Data Set 148
5.4.2 Comparison of direct method on SE(3) and direct method on Sim(3) 151
5.4.3 Threshold of keyframe creation 155
5.5 Analysis of Depth Map and Regularization Algorithm 164
5.5.1 Experiment Data Set 164
5.5.2 Analysis of the depth map 167
5.5.3 Analysis of the camera pose estimation 173
5.6 Real Scene Result 180
5.6.1 Corridor Scene 182
5.6.2 Hall Scene 198
5.6.3 Indoor Scene 213
5.6.4 Outdoor Scene 234
5.6.5 Comparison of the real scenes 249
Chapter 6 Conclusions and Future Works 253
6.1 Conclusions 253
6.2 Future Works 255
REFERENCES 256 | |
dc.language.iso | en | |
dc.title | 基於片狀平面約束與高斯牛頓法進行單眼影像序列之正規半稠密環境重建 | zh_TW |
dc.title | Regularized Semi-Dense Map Reconstruction from a Monocular Sequence based on Piecewise Planar Constraint and Gauss Newton Method | en |
dc.type | Thesis | |
dc.date.schoolyear | 104-2 | |
dc.description.degree | Master's | |
dc.contributor.oralexamcommittee | 簡忠漢,李後燦 | |
dc.subject.keyword | 單眼視覺同步定位與地圖構建,半稠密環境重建,相機姿態估測,非線性最佳化,正規化演算法 | zh_TW |
dc.subject.keyword | Monocular simultaneous localization and mapping, semi-dense map reconstruction, camera pose estimation, nonlinear optimization, regularization algorithm | en |
dc.relation.page | 260 | |
dc.identifier.doi | 10.6342/NTU201602272 | |
dc.rights.note | Paid authorization | |
dc.date.accepted | 2016-08-11 | |
dc.contributor.author-college | 電機資訊學院 | zh_TW |
dc.contributor.author-dept | 電機工程學研究所 | zh_TW |
Appears in Collections: | Department of Electrical Engineering
Files in This Item:
File | Size | Format | |
---|---|---|---|
ntu-105-1.pdf (currently not authorized for public access) | 13.96 MB | Adobe PDF |
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.