Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/86251
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 陳俊杉(Chuin-Shan Chen) | |
dc.contributor.author | CHENG-WEI HUANG | en |
dc.contributor.author | 黃政維 | zh_TW |
dc.date.accessioned | 2023-03-19T23:44:51Z | - |
dc.date.copyright | 2022-08-31 | |
dc.date.issued | 2022 | |
dc.date.submitted | 2022-08-30 | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/86251 | - |
dc.description.abstract | 多軸無人機在過去幾年,不論是業界抑或是學界,皆已開始被廣泛地運用。業界上的應用例如貨物運輸、農業灌溉或是警務巡邏;學界上則使用無人機進行資料自動蒐集、工地地圖重建、交通流量監測或是災害救援等等。雖然以上應用皆已顯現無人機良好的機動性,但目前無人機若須進行自動化移動,大多需要仰賴全球定位系統的協助以提供精準的定位。若任務發生在全球定位系統訊號不佳甚至是室內的環境,無人機便無法進行自動導航。另外,若是需要在未知的環境下使用無人機,通常需要專業的駕駛員操作,以避免無人機墜毀。在這篇研究中,我們提出了一個完整的系統,讓無人機可以快速建置一個擁有六個自由度的障礙物地圖。地圖的建置主要仰賴彩色影像、深度影像以及相機的里程計。藉由障礙物地圖以及里程計,我們的無人機可以自動地在室內導航。在無人機探索室內後,我們的系統利用無人機蒐集的資料建立具有語義資訊的三維地圖。這篇研究提出基於 Transformer 架構的新深度模型,用來對一段序列影像進行語義分割。我們更改了傳統 Transformer 的注意力機制,使其可以處理計算量更大的時序性資料。我們進一步將語義分割後的資料重投影以建立三維地圖。經過驗證比較,我們的模型比起其他模型,可以在相似的任務上提升準確度。 | zh_TW |
dc.description.abstract | Micro Aerial Vehicles (MAVs) have begun to be adopted across industries: companies use them for merchandise delivery, agricultural spraying, and police patrols. MAVs have also gained academic attention, with researchers using them to collect data for construction-site map generation, traffic-flow monitoring, and disaster-rescue tasks. While these applications show that an MAV can travel remotely and unmanned, most rely heavily on the Global Positioning System (GPS) for accurate position information, which is usually unavailable in an unstructured, crowded indoor environment. Furthermore, an MAV placed in an unknown environment usually must be controlled by well-trained professionals, because the lack of an environment map can cause autonomous navigation to fail. In this research, we developed an MAV system that builds a 6-DoF obstacle map from RGB-D and odometry-camera sensor data, enabling the drone to navigate autonomously in an indoor environment. After the MAV navigates through an unknown environment, it is important for it to generate a semantic 3D map, which not only helps the user investigate the unknown environment but also enables the MAV to carry out high-level navigation tasks. To generate this semantic map, we developed a new Transformer-based model for processing sequential data, called Sequential-DDETR. Sequential-DDETR is an end-to-end model that generates sequential segmentation images. It uses a deformable attention module, which reduces computation significantly compared to the traditional Transformer, and it computes attention features across frames to enhance semantic segmentation performance on sequential images. We also use the depth image to back-project the sequence of semantic segmentation masks and build a Semantic Simultaneous Localization and Mapping (Semantic-SLAM) reconstruction. We have shown that our model performs better at building the Semantic-SLAM map than other methods. | en |
dc.description.provenance | Made available in DSpace on 2023-03-19T23:44:51Z (GMT). No. of bitstreams: 1 U0001-2908202214512900.pdf: 5798672 bytes, checksum: 8ef4b81764b414d33752bd28486e77ca (MD5) Previous issue date: 2022 | en |
dc.description.tableofcontents | Acknowledgements i; 摘要 ii; Abstract iii; Contents v; List of Figures viii; List of Tables ix; Chapter 1 Introduction 1; 1.1 Background and Motivation 1; 1.2 Research Objectives 3; 1.3 Organization of Thesis 4; Chapter 2 Literature Review 6; 2.1 Autonomous MAV 6; 2.1.1 Satellite Positioning Method 7; 2.1.2 Cellular Positioning Method 7; 2.1.3 Odometry Positioning Method 8; 2.1.4 Path Planning 9; 2.2 Semantic SLAM 10; 2.2.1 Map representation 10; 2.2.2 Simultaneous Location and Mapping 12; 2.2.3 Semantic Simultaneous Location and Mapping 13; 2.3 Object Detection 14; 2.3.1 Object Detection Based on CNN 14; 2.3.2 Vision Transformer 15; Chapter 3 Method 17; 3.1 System Overview 17; 3.2 MAV Configuration 19; 3.2.1 MAV Hardware 19; 3.2.2 PID Controller Tuning 20; 3.2.3 Robotic Operation System (ROS) 22; 3.3 Obstacle Map and Path Planning 24; 3.3.1 Voxblox 24; 3.3.1.1 TSDF map 24; 3.3.1.2 ESDF map 26; 3.3.2 RRT* Path Planning 26; 3.4 Bundle Fusion 28; 3.4.1 Feature Matching 28; 3.4.2 Pose Align 29; 3.4.3 Map Construction 30; 3.5 Sequential Deformable DEtector TRansformer 31; 3.5.1 Backbone 31; 3.5.2 Transformer Encoder 32; 3.5.3 Positional Encoding 33; 3.5.4 Transformer Decoder 33; 3.5.5 Sequential Multi-scale Deformable Attention Module 34; 3.5.6 Segmentation Head 36; 3.5.7 Set Prediction Loss 37; 3.5.8 Loss Function 37; 3.5.9 Instance Sequence Detection 38; 3.6 Back Projection 39; Chapter 4 Experiments 41; 4.1 MAV Experiment 41; 4.2 SDDETR Model Training 44; 4.2.1 Training Dataset 44; 4.2.2 Training Detail 45; 4.2.3 Image Segmentation Evaluation Detail 48; 4.2.4 Back Projection and 3D Semantic SLAM Evaluation 49; 4.2.5 Attention Mechanism Visualization 54; 4.3 Real World Data Testing 55; Chapter 5 Conclusions and Future Work 57; 5.1 Conclusion 57; 5.2 Future Work 58; References 60 | |
dc.language.iso | en | |
dc.title | 室內自動導航無人機系統之同步定位、地圖構建與影像物件偵測 | zh_TW |
dc.title | Indoor Autonomous Navigation Drone System with Semantic-SLAM based on SD-DETR | en |
dc.type | Thesis | |
dc.date.schoolyear | 110-2 | |
dc.description.degree | Master's | |
dc.contributor.oralexamcommittee | 簡韶逸(Shao-Yi Chien),林之謙(Jacob J. Lin) | |
dc.subject.keyword | 室內自動化無人機、深度學習、電腦視覺、時序性模型、語義分割 | zh_TW |
dc.subject.keyword | Autonomous Indoor MAV, Deep Learning, Computer Vision, Vision Transformer, Semantic SLAM | en |
dc.relation.page | 75 | |
dc.identifier.doi | 10.6342/NTU202202930 | |
dc.rights.note | Authorized for release (worldwide open access) | |
dc.date.accepted | 2022-08-30 | |
dc.contributor.author-college | 工學院 | zh_TW |
dc.contributor.author-dept | 土木工程學研究所 | zh_TW |
dc.date.embargo-lift | 2022-08-31 | - |
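The abstract above describes back-projecting semantic segmentation masks through the depth image to build the 3D semantic map. A minimal sketch of that pinhole-model back projection is shown below; it is an illustration under assumed conventions, not the thesis implementation, and the intrinsics `fx`, `fy`, `cx`, `cy` as well as the function name `back_project` are placeholders.

```python
import numpy as np

def back_project(depth, fx, fy, cx, cy):
    """Back-project a depth image (H, W), in meters, to camera-frame 3D points.

    Pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy, Z = depth.
    A semantic mask aligned with the depth image can then label each 3D point.
    """
    h, w = depth.shape
    # u indexes columns, v indexes rows of the image grid.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1)  # shape (H, W, 3)

# Toy 2x2 depth image at a constant 1 m, principal point at the image center.
pts = back_project(np.ones((2, 2)), fx=1.0, fy=1.0, cx=0.5, cy=0.5)
```

Transforming these camera-frame points by the camera pose from odometry (as the abstract describes) would place them in the world frame for map fusion.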
Appears in Collections: | Department of Civil Engineering
Files in This Item:
File | Size | Format | |
---|---|---|---|
U0001-2908202214512900.pdf | 5.66 MB | Adobe PDF | View/Open |
All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.