Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/79037
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 傅立成(Li-Chen Fu) | |
dc.contributor.author | Ping-Tsang Wu | en |
dc.contributor.author | 吳秉蒼 | zh_TW |
dc.date.accessioned | 2021-07-11T15:38:38Z | - |
dc.date.available | 2023-08-21 | |
dc.date.copyright | 2018-08-21 | |
dc.date.issued | 2018 | |
dc.date.submitted | 2018-08-13 | |
dc.identifier.citation | [1] Dellaert, F.; Kaess, M. Square root SAM: Simultaneous localization and mapping via square root information smoothing. Int. J. Robot. Res. 2006, 25, 1181–1203.
[2] Kaess, M.; Ranganathan, A.; Dellaert, F. iSAM: Incremental smoothing and mapping. IEEE Trans. Robot. 2008, 24, 1365–1378.
[3] Kaess, M.; Johannsson, H.; Roberts, R.; Ila, V.; Leonard, J.J.; Dellaert, F. iSAM2: Incremental smoothing and mapping using the Bayes tree. Int. J. Robot. Res. 2012, 31, 217–236.
[4] Tang, J.; Chen, Y.; Jaakkola, A.; Liu, J.; Hyyppä, J.; Hyyppä, H. NAVIS—An UGV indoor positioning system using laser scan matching for large-area real-time applications. Sensors 2014, 14, 11805–11824.
[5] Grisetti, G.; Stachniss, C.; Burgard, W. Improving grid-based SLAM with Rao-Blackwellized particle filters by adaptive proposals and selective resampling. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Barcelona, Spain, 18–22 April 2005; pp. 2432–2437.
[6] Gouveia, B.D.; Portugal, D.; Marques, L. Speeding up Rao-Blackwellized particle filter SLAM with a multithreaded architecture. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Chicago, IL, USA, 14–18 September 2014; pp. 1583–1588.
[7] Grisetti, G.; Stachniss, C.; Burgard, W. Improved techniques for grid mapping with Rao-Blackwellized particle filters. IEEE Trans. Robot. 2007, 23, 34–46.
[8] Davison, A.J.; Reid, I.D.; Molton, N.D.; Stasse, O. MonoSLAM: Real-time single camera SLAM. IEEE Trans. Pattern Anal. Mach. Intell. 2007, 29, 1052–1067.
[9] Strasdat, H.; Montiel, J.; Davison, A.J. Scale drift-aware large scale monocular SLAM. In Proceedings of Robotics: Science and Systems (RSS), Zaragoza, Spain, 27–30 June 2010.
[10] Ramos, F.T.; Fox, D.; Durrant-Whyte, H.F. CRF-matching: Conditional random fields for feature-based scan matching. In Proceedings of Robotics: Science and Systems (RSS), Atlanta, GA, USA, 27–30 June 2007.
[11] Klein, G.; Murray, D.W. Parallel tracking and mapping for small AR workspaces. In Proceedings of the International Symposium on Mixed and Augmented Reality (ISMAR), 2007.
[12] Engel, J.; Schöps, T.; Cremers, D. LSD-SLAM: Large-scale direct monocular SLAM. In Proceedings of the European Conference on Computer Vision (ECCV), Zurich, Switzerland, September 2014; pp. 834–849.
[13] Forster, C.; Pizzoli, M.; Scaramuzza, D. SVO: Fast semi-direct monocular visual odometry. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Hong Kong, June 2014; pp. 15–22.
[14] Mur-Artal, R.; Montiel, J.; Tardós, J. ORB-SLAM: A versatile and accurate monocular SLAM system. IEEE Trans. Robot. 2015, 31, 1147–1163.
[15] Rublee, E.; Rabaud, V.; Konolige, K.; Bradski, G. ORB: An efficient alternative to SIFT or SURF. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Barcelona, Spain, November 2011; pp. 2564–2571.
[16] May, S.; Droeschel, D.; Holz, D.; Fuchs, S.; Malis, E.; Nüchter, A.; Hertzberg, J. Three-dimensional mapping with time-of-flight cameras. J. Field Robot. 2009, 26, 934–965.
[17] Zhang, X.; Rad, A.B.; Wong, Y.K. Sensor fusion of monocular cameras and laser rangefinders for line-based simultaneous localization and mapping (SLAM) tasks in autonomous mobile robots. Sensors 2012, 12, 429–452.
[18] Henry, P.; Krainin, M.; Herbst, E.; Ren, X.; Fox, D. RGB-D mapping: Using Kinect-style depth cameras for dense 3D modeling of indoor environments. Int. J. Robot. Res. 2012, 31, 647–663.
[19] Lee, D.; Kim, H.; Myung, H. GPU-based real-time RGB-D 3D SLAM. In Proceedings of the International Conference on Ubiquitous Robots and Ambient Intelligence (URAI), Daejeon, Korea, 26–29 November 2012; pp. 46–48.
[20] Lee, D.; Kim, H.; Myung, H. 2D image feature-based real-time RGB-D 3D SLAM. In Proceedings of Robot Intelligence Technology and Applications (RiTA), Gwangju, Korea, 16–18 December 2012; pp. 485–492.
[21] Lee, D.; Myung, H. Solution to the SLAM problem in low dynamic environments using a pose graph and an RGB-D sensor. Sensors 2014, 14, 12467–12496.
[22] Segura, M.J.; Auat Cheein, F.A.; Toibero, J.M.; Mut, V.; Carelli, R. Ultra wide-band localization and SLAM: A comparative study for mobile robot navigation. Sensors 2011, 11, 2035–2055.
[23] He, B.; Zhang, S.; Yan, T.; Zhang, T.; Liang, Y.; Zhang, H. A novel combined SLAM based on RBPF-SLAM and EIF-SLAM for mobile system sensing in a large scale environment. Sensors 2011, 11, 10197–10219.
[24] Shi, Y.; Ji, S.; Shi, Z.; Duan, Y.; Shibasaki, R. GPS-supported visual SLAM with a rigorous sensor model for a panoramic camera in outdoor environments. Sensors 2012, 12, 119–136.
[25] Munguía, R.; Castillo-Toledo, B.; Grau, A. A robust approach for a filter-based monocular simultaneous localization and mapping (SLAM) system. Sensors 2013, 13, 8501–8522.
[26] Guerra, E.; Munguia, R.; Grau, A. Monocular SLAM for autonomous robots with enhanced features initialization. Sensors 2014, 14, 6317–6337.
[27] Lee, S.M.; Jung, J.; Kim, S.; Kim, I.J.; Myung, H. DV-SLAM (dual-sensor-based vector-field SLAM) and observability analysis. IEEE Trans. Ind. Electron. 2014, 62, 1101–1112.
[28] Jung, J.; Oh, T.; Myung, H. Magnetic field constraints and sequence-based matching for indoor pose graph SLAM. Robot. Auton. Syst. 2015, 70, 92–105.
[29] Adelson, E.H.; Anderson, C.H.; Bergen, J.R.; Burt, P.J.; Ogden, J.M. Pyramid methods in image processing. RCA Engineer 1984, 29(6), 33–41.
[30] Longuet-Higgins, H.C. A computer algorithm for reconstructing a scene from two projections. Nature 1981, 293(5828), 133–135.
[31] Hartley, R.; Zisserman, A. Multiple View Geometry in Computer Vision. Cambridge University Press, 2003.
[32] Mur-Artal, R.; Tardós, J.D. ORB-SLAM2: An open-source SLAM system for monocular, stereo and RGB-D cameras. 2016.
[33] Gibson, J.J. The Senses Considered as Perceptual Systems. Allen and Unwin, London, 1966.
[34] Gibson, J.J. The Ecological Approach to Visual Perception. Houghton-Mifflin, Boston, MA, 1979.
[35] Vasudevan, S.; Gächter, S.; Berger, M.; Siegwart, R. Cognitive maps for mobile robots: An object based approach. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2006.
[36] Greene, M.R.; Baldassano, C.; Esteva, A. Affordances provide a fundamental categorization principle for visual scenes. 2014. https://arxiv.org/ftp/arxiv/papers/1411/1411.5340.pdf
[37] Kjellström, H.; Romero, J.; Kragic, D. Visual object-action recognition: Inferring object affordances from human demonstration. Computer Vision and Image Understanding 2011, 115, 81–90.
[38] Koppula, H.S.; Saxena, A. Anticipating human activities using object affordances for reactive robotic response. IEEE Trans. Pattern Anal. Mach. Intell. 2016, 38(1), 14–29.
[39] Piyathilaka, L.; Kodagoda, S. Active visual object search using affordance-map in real world: A human-centric approach. In Proceedings of the International Conference on Control, Automation, Robotics and Vision (ICARCV), 2014.
[40] Piyathilaka, L.; Kodagoda, S. Affordance-map: Mapping human context in 3D scenes using cost-sensitive SVM and virtual human models. In Proceedings of the IEEE International Conference on Robotics and Biomimetics (ROBIO), 2015; pp. 2035–2040.
[41] Koppula, H.S.; Gupta, R.; Saxena, A. Learning human activities and object affordances from RGB-D videos. Int. J. Robot. Res. 2013, 32(8), 951–970.
[42] Dutta, V.; Zielinska, T. Action prediction based on physically grounded object affordances in human-object interactions. In Proceedings of the 11th International Workshop on Robot Motion and Control (RoMoCo), 2017; pp. 47–52.
[43] Pieropan, A.; Ek, C.H.; Kjellström, H. Functional object descriptors for human activity modeling. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2013; pp. 1282–1289.
[44] Wu, P.-T.; Chan, S.-H.; Fu, L.-C. Robust 2D indoor localization using laser SLAM and visual SLAM fusion. In Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics (SMC), 2018.
[45] https://github.com/ros-planning/navigation
[46] Quigley, M.; et al. ROS: An open-source Robot Operating System. In ICRA Workshop on Open Source Software, 2009.
[47] Mei, C.; Sibley, G.; Newman, P. Closing loops without places. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), October 2010; pp. 3738–3744.
[48] Redmon, J.; Divvala, S.K.; Girshick, R.B.; Farhadi, A. You Only Look Once: Unified, real-time object detection. In CVPR, 2016.
[49] Redmon, J.; Farhadi, A. YOLO9000: Better, faster, stronger. arXiv preprint arXiv:1612.08242, 2016.
[50] Redmon, J.; Farhadi, A. YOLOv3: An incremental improvement. arXiv preprint arXiv:1804.02767, 2018.
[51] YOLO official website: https://pjreddie.com/darknet/yolo/
[52] Girshick, R. Fast R-CNN. In ICCV, 2015.
[53] Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. In NIPS, 2015.
[54] Cao, Z.; Simon, T.; Wei, S.-E.; Sheikh, Y. Realtime multi-person 2D pose estimation using part affinity fields. In CVPR, 2017.
[55] Simon, T.; Joo, H.; Matthews, I.; Sheikh, Y. Hand keypoint detection in single images using multiview bootstrapping. In CVPR, 2017.
[56] Wei, S.-E.; Ramakrishna, V.; Kanade, T.; Sheikh, Y. Convolutional pose machines. In CVPR, 2016.
[57] Andriluka, M.; Pishchulin, L.; Gehler, P.; Schiele, B. 2D human pose estimation: New benchmark and state of the art analysis. In CVPR, 2014.
[58] https://github.com/CMU-Perceptual-Computing-Lab/openpose
[59] Alkire, B. Character animation and gesture-based interaction in an immersive virtual environment using skeletal tracking with the Microsoft Kinect. September 2011.
[60] Quigley, M.; Gerkey, B.; Conley, K.; Faust, J.; Foote, T.; Leibs, J.; Berger, E.; Wheeler, R.; Ng, A. ROS: An open-source robot operating system. In Proceedings of the Open-Source Software Workshop, IEEE International Conference on Robotics and Automation (ICRA), 2009.
[61] Hart, P.E.; Nilsson, N.J.; Raphael, B. A formal basis for the heuristic determination of minimum cost paths. IEEE Trans. Syst. Sci. Cybern. 1968, 4(2).
[62] Fox, D.; Burgard, W.; Thrun, S. The dynamic window approach to collision avoidance. IEEE Robotics & Automation Magazine 1997, 4(1).
[63] Quattoni, A.; Torralba, A. Recognizing indoor scenes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009.
[64] Fusiello, A. Elements of Geometric Computer Vision. April 2008. http://www.sci.univr.it/~fusiello
[65] https://www.tootsweet4two.com/42-things-in-your-living-room-family-room-andor-great-room/
[66] https://www.tootsweet4two.com/42-kitchen-basics-for-your-new-home/
[67] https://www.tootsweet4two.com/42-things-in-your-master-bedroom/
[68] https://www.forbes.com/sites/wirecutter/2016/03/30/the-best-home-office-furniture-and-supplies/#1a5d260466ef
[69] Kolmogorov, A. Foundations of the Theory of Probability. Chelsea, 1956.
[70] Vieira, M.; Faria, D.R.; Nunes, U. Real-time application for monitoring human daily activities and risk situations in robot-assisted living. In Proceedings of Robot'15: 2nd Iberian Robotics Conference, 2015.
[71] Faria, D.R.; Premebida, C.; Nunes, U. A probabilistic approach for human everyday activities recognition using body motion from RGB-D images. In RO-MAN 2014: 23rd IEEE International Symposium on Robot and Human Interactive Communication, Edinburgh, 2014.
[72] Liao, L.; Fox, D.; Kautz, H. Location-based activity recognition using relational Markov networks. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), 2005.
[73] Koppula, H.; Saxena, A. Anticipating human activities using object affordances for reactive robotic response. IEEE Trans. Pattern Anal. Mach. Intell. 2016.
[74] Uddin, M.Z.; Thang, N.D.; Kim, J.T.; Kim, T.S. Human activity recognition using body joint-angle features and hidden Markov model. ETRI Journal 2011, 33(4), 569–579.
[75] https://www.mindtools.com/pages/article/Body_Language.htm
[76] Rich, C.; Ponsler, B.; Holroyd, A.; Sidner, C.L. Recognizing engagement in human-robot interaction. In Proceedings of the 5th ACM/IEEE International Conference on Human-Robot Interaction, Osaka, Japan, 2–5 March 2010.
[77] Lee, M.W.; Cohen, I. Proposal maps driven MCMC for estimating human body pose in static images. In CVPR, 2004; vol. 2, pp. II-334.
[78] Simo-Serra, E.; Ramisa, A.; Alenyà, G.; Torras, C.; Moreno-Noguer, F. Single image 3D human pose estimation from noisy observations. In CVPR, 2012.
[79] Valmadre, J.; Lucey, S. Deterministic 3D human pose estimation using rigid structure. In ECCV, 2010; pp. 467–480.
[80] Wei, X.K.; Chai, J. Modeling 3D human poses from uncalibrated monocular images. In ICCV, 2009; pp. 1873–1880.
[81] Wang, C.; Wang, Y.; Lin, Z.; Yuille, A.L.; Gao, W. Robust estimation of 3D human poses from a single image. In CVPR, 2014. | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/79037 | - |
dc.description.abstract | 近年來,社交型與服務型機器人的研究逐漸嶄露出其重要性,隨著老年人口的增加,陪伴型與服務型機器人有著龐大的應用潛能。然而這些機器人能夠順利完成任務的首要前提必須是機器人擁有非常強健的定位能力、強大的感知能力以及安全導航的能力;另一方面,為因應實際使用的需求,所有的計算必須講求效率。能夠雙方面都兼顧才能使機器人更邁向實際應用的階段。
本研究提出一個新穎的系統架構,其中分別包含了「定位系統」、「感知系統」、「推論系統」以及「導航系統」。在「定位系統」中,本研究提出了一個創新的方法來結合兩種不同性質的「同時建圖及定位系統」演算法,藉由來自不同性質的兩種演算法可以互補各自的不足以達到機器人強健的定位。在感知方面,近年來的研究有許多利用類神經網路機器學習解決物體辨識、場景辨識、動作識別等等之問題;然而同時使用過多的類神經網路卻是不切實際的作法,其一,類神經網路會使用過多的電腦暫存空間,其二,同時且實時的執行多個類神經網路模型在同一台電腦上並非容易的事情,其三,過度的浪費運算資源將會使系統架構的未來擴展性受限制。因此在本研究中,僅物體辨識與人體姿態感測採用類神經網路的方式做運算,其餘如場景辨識以及動作辨識將會基於所提供的物體「可供性能力」使機器人推論出最佳的結果。在「感知系統」中,機器人會利用類神經網路實現物體辨識的能力,並且結合深度影像確認出該物體在空間中相對於「定位系統」地圖中的空間位置。在「推論系統」中,本研究提出了一個結合人類知識機率模型,此機率模型會依據使用者所提供的「可供性能力」以及此環境的資料來做出最佳的推論,其優點是不需要大量的訓練資料以及訓練時間便能使機器人做出足夠準確的推論,並且推論結果會依照使用者所提供的資料而能有不同表現,此方法能夠體現出依環境客制化的優點。在「導航系統」的部份,本研究則加強改進現有開源程式包之系統架構來達到更穩健且社交友善的導航表現。 | zh_TW |
dc.description.abstract | In recent years, research on social and service robots has steadily grown in importance. With the increase of the elderly population, companion and service robots have enormous application potential. The first prerequisite for these robots to complete their tasks successfully is that the robot possess very robust localization, strong perception, and the ability to navigate safely; on the other hand, to meet the demands of practical use, computational efficiency must be emphasized. Only by addressing both aspects can robots move closer to real-world deployment.
This thesis proposes a novel system architecture comprising a localization subsystem, a perception subsystem, an inference subsystem, and a navigation subsystem. In the localization subsystem, we propose an innovative method to fuse two different kinds of Simultaneous Localization and Mapping (SLAM) algorithms; because the two algorithms have complementary characteristics, the fusion achieves more robust localization. As for perception, much recent research applies neural-network-based machine learning to problems such as object recognition, scene recognition, and action recognition; however, running too many neural networks at once is impractical. First, neural networks consume a large amount of memory. Second, executing multiple neural network models simultaneously, and especially in real time, on the same computer is not easy. Third, wasting computing resources limits the future extensibility of the system architecture. Therefore, in this thesis only object recognition and human skeleton detection use neural network approaches; for other information such as scene recognition and action recognition, the robot infers the best result based on the "affordances" of objects provided by the user. In the perception subsystem, the robot uses a neural network to perform object recognition, combined with the depth image to determine each object's spatial position relative to the map built by the localization subsystem. In the inference subsystem, we propose a probability model that incorporates human knowledge: it makes the best inference based on the affordances provided by the user and data from the environment. Its advantage is that the robot can make sufficiently accurate inferences without a large amount of training data or training time, and the inference results adapt to the information the user provides, so the method can be customized to each environment. In the navigation subsystem, we improve the system architecture of an existing open-source package to achieve more robust and socially friendly navigation. | en |
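Before the two trajectories in the localization subsystem can complement each other, the scale-ambiguous monocular (visual SLAM) trajectory must be aligned to the metric laser trajectory. As an illustration only, not the thesis's exact fusion pipeline (Chapter 3 details the actual trajectory extraction and essential-matrix steps), a closed-form 2D similarity alignment in the Horn/Umeyama style over matched trajectory points might look like this:

```python
import math

def align_trajectories(visual, laser):
    """Closed-form 2D similarity alignment (Horn/Umeyama style).

    Finds scale s, rotation theta, and translation (tx, ty) minimizing
    sum || s * R(theta) * visual[i] + t - laser[i] ||^2 over matched
    (x, y) points: 'visual' is the scale-ambiguous monocular trajectory,
    'laser' the metric one.
    """
    n = len(visual)
    ax = sum(x for x, _ in visual) / n   # visual centroid
    ay = sum(y for _, y in visual) / n
    bx = sum(x for x, _ in laser) / n    # laser centroid
    by = sum(y for _, y in laser) / n
    dot = cross = norm = 0.0
    for (vx, vy), (lx, ly) in zip(visual, laser):
        dvx, dvy = vx - ax, vy - ay      # centered visual point
        dlx, dly = lx - bx, ly - by      # centered laser point
        dot += dvx * dlx + dvy * dly     # cos-aligned correlation
        cross += dvx * dly - dvy * dlx   # sin-aligned correlation
        norm += dvx * dvx + dvy * dvy    # visual spread (fixes scale)
    theta = math.atan2(cross, dot)
    s = math.hypot(dot, cross) / norm
    c, k = math.cos(theta), math.sin(theta)
    tx = bx - s * (c * ax - k * ay)      # translation from centroids
    ty = by - s * (k * ax + c * ay)
    return s, theta, (tx, ty)

# Synthetic check: the "laser" points are the "visual" points scaled by 2,
# rotated 90 degrees, and shifted by (3, 4).
s, theta, (tx, ty) = align_trajectories(
    [(0, 0), (1, 0), (1, 1)], [(3, 4), (3, 6), (1, 6)])
```

The recovered scale is what lets metric laser odometry correct the monocular drift; the residual of this fit can also flag stretches where one of the two SLAM systems has failed.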
dc.description.provenance | Made available in DSpace on 2021-07-11T15:38:38Z (GMT). No. of bitstreams: 1 ntu-107-R05921013-1.pdf: 4523914 bytes, checksum: 5f289e7cdf3db3768ad6899ff8d2a9f0 (MD5) Previous issue date: 2018 | en |
dc.description.tableofcontents | 口試委員審定書 ii
致謝 ii
摘要 iv
Abstract v
List of Tables xi
List of Figures xii
Chapter 1 Introduction 1
1.1 Motivation 1
1.2 Research Objectives 3
1.3 Contributions 4
1.4 Thesis Overview 5
Chapter 2 Background and Related Works 6
2.1 Simultaneous Localization And Mapping (SLAM) 6
2.1.1 Introduction 6
2.1.2 Laser-based SLAM 7
2.1.3 Monocular camera-based SLAM 8
2.1.4 SLAM fusion 10
2.2 Affordance 11
2.2.1 Introduction of Affordance 11
2.2.2 Use Affordance to Improve Scene Recognition 12
2.2.3 Use Affordance to Improve Activity Recognition 13
2.2.4 Summary of Affordance 13
Chapter 3 Robust 2D Localization by Using Laser-based and Monocular Camera-based SLAM Algorithms Fusion 15
3.1 Preliminary 15
3.1.1 Laser-based SLAM algorithm: GMapping and AMCL 15
3.1.2 Monocular camera-based SLAM algorithm: ORB-SLAM2 17
3.2 Proposed SLAM Fusion Architecture Overview 18
3.3 Methodology of the Architecture 20
3.3.1 Trajectory Extraction 20
3.3.2 Aligned Angle Analysis 22
3.3.3 Curvature Filter 23
3.3.4 Pyramid Filter 25
3.3.5 Find Essential Matrix 25
3.3.6 Integrating 26
Chapter 4 Multi-Layer Affordance Map 27
4.1 Preliminary 28
4.1.1 You Only Look Once (YOLO) 29
4.1.2 OpenPose Human Skeleton Detection 30
4.2 Object Layer 31
4.2.1 Object Segmentation 32
4.2.2 Transformation Matrices from Depth Image to 3D Pointcloud and 2D Object Map 35
4.2.3 Update of Objects' Locations on 2D Object Map 38
4.3 Scene Layer 39
4.3.1 Objects Region Separation 40
4.3.2 Construct Objects' Affordance 43
4.3.3 Proposed Probability Model for Scene Inference 44
4.4 Event Layer 48
4.4.1 Image Stitching 49
4.4.2 Human Pose State Detection and Human's Engagement Level 50
4.4.3 Event Probability 53
Chapter 5 Mobile Robot Indoor Navigation 57
5.1 Preliminary – ROS package: Navigation Stack 57
5.2 Improved Architecture for Navigation 59
Chapter 6 Experimental Results 64
6.1 Experimental Results for SLAM Fusion 64
6.1.1 Experiment: Using Pioneer 3DX 67
6.1.2 Experiment with Pepper 71
6.1.3 Discussion 72
6.2 Experimental Results for Multi-Layer Affordance Map 73
6.2.1 Experiment: Object Layer 73
6.2.2 Experiment: Scene Layer 76
6.2.3 Experiment: Event Layer 80
6.2.4 Discussion 88
Chapter 7 Conclusion and Future Works 90
Reference 92 | |
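The scene layer above (Section 4.3.3, "Proposed Probability Model for Scene Inference") infers a scene label from the objects detected in a region and their user-supplied affordances. As a loose, hypothetical sketch of that kind of inference, where the likelihood table, scene set, and smoothing constant are all illustrative placeholders rather than the thesis's actual model or data:

```python
# Hypothetical affordance/likelihood tables: P(object | scene).
# The numbers below are placeholders for the values a user would supply.
LIKELIHOOD = {
    "kitchen":     {"stove": 0.90, "sofa": 0.05, "bed": 0.01},
    "living room": {"stove": 0.05, "sofa": 0.90, "bed": 0.05},
    "bedroom":     {"stove": 0.01, "sofa": 0.10, "bed": 0.90},
}
PRIOR = {scene: 1.0 / len(LIKELIHOOD) for scene in LIKELIHOOD}  # uniform
SMOOTH = 0.01  # small likelihood for objects missing from a scene's table

def infer_scene(detected_objects):
    """Return (best scene, normalized posterior) for a list of objects.

    Naive-Bayes-style scoring: posterior(scene) is proportional to
    prior(scene) * product of P(object | scene) over detected objects.
    """
    scores = {}
    for scene, prior in PRIOR.items():
        score = prior
        for obj in detected_objects:
            score *= LIKELIHOOD[scene].get(obj, SMOOTH)
        scores[scene] = score
    total = sum(scores.values())
    posterior = {scene: score / total for scene, score in scores.items()}
    return max(posterior, key=posterior.get), posterior

best, posterior = infer_scene(["sofa", "sofa"])  # two sofas detected
```

A table-driven model like this needs no training phase, which matches the abstract's claim that user-supplied affordances let the robot adapt to a new environment without large training data: changing the tables changes the inference immediately.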
dc.language.iso | zh-TW | |
dc.title | 以多層環境可供性地圖達成強健室內定位、人類活動事件偵測以及社交友善導航 | zh_TW |
dc.title | Robust Indoor Localization, Human Event Detection and Socially Friendly Navigation Based on Multi-Layer Environmental Affordance Map | en |
dc.type | Thesis | |
dc.date.schoolyear | 106-2 | |
dc.description.degree | 碩士 | |
dc.contributor.oralexamcommittee | 施吉昇(Chi-Sheng Shih),簡忠漢(Jong-Hann Jean),林沛群(Pei-Chun Lin),王傑智(Chieh-Chih Wang) | |
dc.subject.keyword | 同時建圖及定位系統,人類活動事件偵測,可供性地圖,社交友善導航 | zh_TW |
dc.subject.keyword | Simultaneous Localization And Mapping,Human Event Detection,Affordance Map,Socially Friendly Navigation | en |
dc.relation.page | 98 | |
dc.identifier.doi | 10.6342/NTU201803236 | |
dc.rights.note | 有償授權 | |
dc.date.accepted | 2018-08-14 | |
dc.contributor.author-college | 電機資訊學院 | zh_TW |
dc.contributor.author-dept | 電機工程學研究所 | zh_TW |
dc.date.embargo-lift | 2023-08-21 | - |
Appears in Collections: | 電機工程學系
Files in This Item:
File | Size | Format | |
---|---|---|---|
ntu-107-R05921013-1.pdf (Restricted Access) | 4.42 MB | Adobe PDF |
All items in this repository, unless otherwise indicated, are protected by copyright, with all rights reserved.