Please use this identifier to cite or link to this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98468

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 陳俊杉 | zh_TW |
| dc.contributor.advisor | Chuin-Shan Chen | en |
| dc.contributor.author | 詹承諺 | zh_TW |
| dc.contributor.author | Cheng-Yen Chan | en |
| dc.date.accessioned | 2025-08-14T16:14:08Z | - |
| dc.date.available | 2025-08-15 | - |
| dc.date.copyright | 2025-08-14 | - |
| dc.date.issued | 2025 | - |
| dc.date.submitted | 2025-07-31 | - |
| dc.identifier.citation | [1] Guilherme Potje, Felipe Cadar, André Araujo, Renato Martins, and Erickson R. Nascimento. XFeat: Accelerated features for lightweight image matching. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 2682–2691, 2024.
[2] Milan Erdelj, Enrico Natalizio, Kaushik R. Chowdhury, and Ian F. Akyildiz. Help from the sky: Leveraging UAVs for disaster management. IEEE Pervasive Computing, 16(1):24–32, 2017.
[3] Jaihyun Lee. Optimization of a modular drone delivery system. In 2017 Annual IEEE International Systems Conference (SysCon), pages 1–8. IEEE, 2017.
[4] Zhefan Xu, Baihan Chen, Xiaoyang Zhan, Yumeng Xiu, Christopher Suzuki, and Kenji Shimada. A vision-based autonomous UAV inspection framework for unknown tunnel construction sites with dynamic obstacles. IEEE Robotics and Automation Letters, 8(8):4983–4990, 2023.
[5] David Nistér, Oleg Naroditsky, and James Bergen. Visual odometry. In Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2004), volume 1, pages I–I. IEEE, 2004.
[6] Ji Zhang, Sanjiv Singh, et al. LOAM: Lidar odometry and mapping in real-time. In Robotics: Science and Systems, volume 2, pages 1–9. Berkeley, CA, 2014.
[7] Saad Mokssit, Daniel Bonilla Licea, Bassma Guermah, and Mounir Ghogho. Deep learning techniques for visual SLAM: A survey. IEEE Access, 11:20026–20050, 2023.
[8] Changhao Chen, Bing Wang, Chris Xiaoxuan Lu, Niki Trigoni, and Andrew Markham. Deep learning for visual localization and mapping: A survey. IEEE Transactions on Neural Networks and Learning Systems, 2023.
[9] Jiayi Ma, Xingyu Jiang, Aoxiang Fan, Junjun Jiang, and Junchi Yan. Image matching from handcrafted to deep features: A survey. Int. J. Comput. Vision, 129(1):23–79, January 2021.
[10] Tong Qin, Jie Pan, Shaozu Cao, and Shaojie Shen. A general optimization-based framework for local odometry estimation with multiple sensors, 2019.
[11] Tong Qin, Shaozu Cao, Jie Pan, and Shaojie Shen. A general optimization-based framework for global pose estimation with multiple sensors, 2019.
[12] Stefan Leutenegger, Simon Lynen, Michael Bosse, Roland Siegwart, and Paul Furgale. Keyframe-based visual–inertial odometry using nonlinear optimization. The International Journal of Robotics Research, 34(3):314–334, 2015.
[13] Tixiao Shan, Brendan Englot, Drew Meyers, Wei Wang, Carlo Ratti, and Daniela Rus. LIO-SAM: Tightly-coupled lidar inertial odometry via smoothing and mapping. In 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 5135–5142, 2020.
[14] Ji Zhang and Sanjiv Singh. Visual-lidar odometry and mapping: low-drift, robust, and fast. In 2015 IEEE International Conference on Robotics and Automation (ICRA), pages 2174–2181, 2015.
[15] Panagiotis Radoglou-Grammatikis, Panagiotis Sarigiannidis, Thomas Lagkas, and Ioannis Moscholios. A compilation of UAV applications for precision agriculture. Computer Networks, 172:107148, 2020.
[16] Han Liang, Seong-Cheol Lee, Woosung Bae, Jeongyun Kim, and Suyoung Seo. Towards UAVs in construction: Advancements, challenges, and future directions for monitoring and inspection. Drones, 7, 2023.
[17] Davide Cristiani, Filippo Bottonelli, Angelo Trotta, and Marco Di Felice. Inventory management through mini-drones: Architecture and proof-of-concept implementation. In 2020 IEEE 21st International Symposium on "A World of Wireless, Mobile and Multimedia Networks" (WoWMoM), pages 317–322, 2020.
[18] Janosch Nikolic, Michael Burri, Joern Rehder, Stefan Leutenegger, Christoph Huerzeler, and Roland Siegwart. A UAV system for inspection of industrial facilities. In 2013 IEEE Aerospace Conference, pages 1–8, 2013.
[19] Ebtehal Turki Alotaibi, Shahad Saleh Alqefari, and Anis Koubaa. LSAR: Multi-UAV collaboration for search and rescue missions. IEEE Access, 7:55817–55832, 2019.
[20] David G. Lowe. Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vision, 60(2):91–110, November 2004.
[21] Herbert Bay, Tinne Tuytelaars, and Luc Van Gool. SURF: Speeded up robust features. In Proceedings of the 9th European Conference on Computer Vision - Volume Part I, ECCV'06, pages 404–417, Berlin, Heidelberg, 2006. Springer-Verlag.
[22] Michael Calonder, Vincent Lepetit, Christoph Strecha, and Pascal Fua. BRIEF: Binary robust independent elementary features. In Computer Vision–ECCV 2010: 11th European Conference on Computer Vision, Heraklion, Crete, Greece, September 5-11, 2010, Proceedings, Part IV 11, pages 778–792. Springer, 2010.
[23] Edward Rosten and Tom Drummond. Machine learning for high-speed corner detection. In Computer Vision–ECCV 2006: 9th European Conference on Computer Vision, Graz, Austria, May 7-13, 2006, Proceedings, Part I 9, pages 430–443. Springer, 2006.
[24] Ethan Rublee, Vincent Rabaud, Kurt Konolige, and Gary Bradski. ORB: An efficient alternative to SIFT or SURF. In 2011 International Conference on Computer Vision, pages 2564–2571, 2011.
[25] Kwang Moo Yi, Eduard Trulls, Vincent Lepetit, and Pascal Fua. LIFT: Learned invariant feature transform. In Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11-14, 2016, Proceedings, Part VI 14, pages 467–483. Springer, 2016.
[26] Daniel DeTone, Tomasz Malisiewicz, and Andrew Rabinovich. SuperPoint: Self-supervised interest point detection and description. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pages 224–236, 2018.
[27] Michał Tyszkiewicz, Pascal Fua, and Eduard Trulls. DISK: Learning local features with policy gradient. Advances in Neural Information Processing Systems, 33:14254–14265, 2020.
[28] Pierre Gleize, Weiyao Wang, and Matt Feiszli. SiLK: Simple learned keypoints. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 22499–22508, 2023.
[29] Raúl Mur-Artal and Juan D. Tardós. ORB-SLAM2: An open-source SLAM system for monocular, stereo, and RGB-D cameras. IEEE Transactions on Robotics, 33(5):1255–1262, 2017.
[30] Carlos Campos, Richard Elvira, Juan J. Gómez Rodríguez, José M. M. Montiel, and Juan D. Tardós. ORB-SLAM3: An accurate open-source library for visual, visual–inertial, and multimap SLAM. IEEE Transactions on Robotics, 37(6):1874–1890, 2021.
[31] Tong Qin, Peiliang Li, and Shaojie Shen. VINS-Mono: A robust and versatile monocular visual-inertial state estimator. IEEE Transactions on Robotics, 34(4):1004–1020, 2018.
[32] Tong Qin and Shaojie Shen. Online temporal calibration for monocular visual-inertial systems. In 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 3662–3669. IEEE, 2018.
[33] Zhangzhen Zhao, Tao Song, Bin Xing, Yu Lei, and Ziqin Wang. PLI-VINS: Visual-inertial SLAM based on point-line feature fusion in indoor environment. Sensors, 22(14), 2022.
[34] Sen Wang, Ronald Clark, Hongkai Wen, and Niki Trigoni. DeepVO: Towards end-to-end visual odometry with deep recurrent convolutional neural networks. In 2017 IEEE International Conference on Robotics and Automation (ICRA), pages 2043–2050. IEEE, 2017.
[35] Zachary Teed and Jia Deng. DROID-SLAM: Deep visual SLAM for monocular, stereo, and RGB-D cameras. Advances in Neural Information Processing Systems, 34:16558–16569, 2021.
[36] Hudson Martins Silva Bruno and Esther Luna Colombini. LIFT-SLAM: A deep-learning feature-based monocular visual SLAM method. Neurocomputing, 455:97–110, 2021.
[37] Dongjiang Li, Xuesong Shi, Qiwei Long, Shenghui Liu, Wei Yang, Fangshi Wang, Qi Wei, and Fei Qiao. DXSLAM: A robust and efficient visual SLAM system with deep features. In 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 4958–4965. IEEE Press, 2020.
[38] Zhang Xiao and Shuaixin Li. A real-time, robust and versatile visual-SLAM framework based on deep learning networks. arXiv preprint arXiv:2405.03413, 2024.
[39] Philipp Lindenberger, Paul-Edouard Sarlin, and Marc Pollefeys. LightGlue: Local feature matching at light speed. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 17627–17638, 2023.
[40] Morgan Quigley, Ken Conley, Brian Gerkey, Josh Faust, Tully Foote, Jeremy Leibs, Rob Wheeler, Andrew Y. Ng, et al. ROS: an open-source robot operating system. In ICRA Workshop on Open Source Software, volume 3, page 5. Kobe, 2009.
[41] Paul Furgale, Joern Rehder, and Roland Siegwart. Unified temporal and spatial calibration for multi-sensor systems. In 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 1280–1286. IEEE, 2013.
[42] Anis Koubâa, Azza Allouch, Maram Alajlan, Yasir Javed, Abdelfettah Belghith, and Mohamed Khalgui. Micro air vehicle link (MAVLink) in a nutshell: A survey. IEEE Access, 7:87658–87680, 2019.
[43] Berthold K. P. Horn and Brian G. Schunck. Determining optical flow. Artificial Intelligence, 17(1):185–203, 1981.
[44] Bruce D. Lucas and Takeo Kanade. An iterative image registration technique with an application to stereo vision. In Proceedings of the 7th International Joint Conference on Artificial Intelligence - Volume 2, IJCAI'81, pages 674–679, San Francisco, CA, USA, 1981. Morgan Kaufmann Publishers Inc.
[45] Rudolph Emil Kalman. A new approach to linear filtering and prediction problems. 1960.
[46] Rudolph E. Kalman and Richard S. Bucy. New results in linear filtering and prediction theory. 1961.
[47] Xin Zhou, Zhepei Wang, Hongkai Ye, Chao Xu, and Fei Gao. EGO-Planner: An ESDF-free gradient-based local planner for quadrotors. IEEE Robotics and Automation Letters, 6(2):478–485, 2020.
[48] Michael Burri, Janosch Nikolic, Pascal Gohl, Thomas Schneider, Joern Rehder, Sammy Omari, Markus W. Achtelik, and Roland Siegwart. The EuRoC micro aerial vehicle datasets. The International Journal of Robotics Research, 2016.
[49] A. V. Oppenheim. Discrete-Time Signal Processing. Pearson Education Signal Processing Series. Pearson Education, 1999.
[50] F. Landis Markley, Yang Cheng, John L. Crassidis, and Yaakov Oshman. Averaging quaternions. Journal of Guidance, Control, and Dynamics, 30(4):1193–1197, 2007. | - |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98468 | - |
| dc.description.abstract | 自主無人機(UAV)在複雜的室內或無 GPS 環境中運作時,亟需具備穩定且準確的定位能力。傳統視覺里程計(VO)系統依賴手工(hand-crafted)的特徵偵測與追蹤方法,但在低紋理或動態場景中,往往面臨定位效能下降的問題。本論文提出 XVINS,一個專為無人機設計的輕量化視覺里程計框架,結合深度學習特徵(XFeat),以提升前端追蹤流程的穩定性及準確性。本方法採用混合式設計:首先利用傳統光流法進行高效率特徵追蹤,當傳統追蹤無法維持足夠特徵時,則透過XFeat 深度特徵進行補充。經由模擬資料集與實際室內飛行場景的實驗結果顯示,所提出的 XVINS 系統在軌跡穩定性與環境適應性方面,均顯著優於傳統視覺里程計方法。為進一步確保自主飛行的穩定性與可靠性,本研究亦整合光流計,並系統性地調整控制參數,以提升在複雜環境下的飛行控制能力。透過模擬資料集與實際室內飛行場景的實驗結果顯示,所提出的 XVINS 系統結合上述強化措施後,在軌跡穩定性與面對環境挑戰的適應性方面,均顯著優於傳統視覺里程計方法。 | zh_TW |
| dc.description.abstract | Autonomous unmanned aerial vehicles (UAVs) require robust and accurate localization to operate in challenging indoor or GPS-denied environments. Traditional visual odometry (VO) systems, which rely on hand-crafted feature detection and tracking, often struggle in low-texture or dynamic scenes, leading to degraded localization performance. In this thesis, we present XVINS, a lightweight visual odometry framework for UAVs that integrates deep learning-based features (XFeat) to enhance both the robustness and stability of the front-end tracking pipeline. Our method employs a hybrid approach: conventional optical flow is initially used for efficient feature tracking, while deep features from XFeat are selectively introduced to supplement the system when conventional tracking is insufficient. To further ensure stable and reliable autonomous flight, we also incorporate an onboard optical flow sensor and systematically tune control parameters to improve flight control under challenging conditions. Experimental results in both simulated datasets and real-world indoor flight scenarios demonstrate that the proposed XVINS system, together with these additional enhancements, significantly outperforms traditional VO in terms of trajectory stability and resilience to environmental challenges. | en |
| dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-08-14T16:14:08Z No. of bitstreams: 0 | en |
| dc.description.provenance | Made available in DSpace on 2025-08-14T16:14:08Z (GMT). No. of bitstreams: 0 | en |
| dc.description.tableofcontents | Acknowledgements iii
摘要 v
Abstract vii
Contents ix
List of Illustrations xiii
List of Tables xv
Denotation xvii
Chapter 1 Introduction 1
1.1 Background and Motivation 1
1.2 Research Objectives 2
1.3 Organization of Thesis 3
Chapter 2 Literature Review 5
2.1 Autonomous UAV 5
2.1.1 Localization 6
2.1.2 Indoor UAV Flight Control in GPS-Denied Environments 7
2.1.3 Application 7
2.2 Image Features for Visual Localization 8
2.2.1 Traditional Features 8
2.2.2 Deep learning-based Features 9
2.3 Visual Odometry (VO) 10
2.3.1 Traditional Visual Odometry 10
2.3.2 Deep Visual Odometry 11
2.3.3 Visual Odometry with Deep Features 12
Chapter 3 Methodology 15
3.1 Research Overview 15
3.2 Visual Odometry with Deep Features 16
3.2.1 XVINS System Overview 17
3.2.2 Sensor Calibration 18
3.2.3 Deep Feature Tracking using XFeat 19
3.3 UAV Configuration 21
3.3.1 UAV Hardware 21
3.3.2 ROS Architecture 23
3.3.3 PID Tuning 24
3.3.4 ArduPilot Settings 26
3.4 Indoor UAV Flight Control 27
3.4.1 Sensor State Estimation 27
3.4.2 Extended Kalman Filter 3 (EKF3) 29
3.4.3 Sensor Frames and TF Transformation 29
3.4.4 MAVROS Communication 32
3.4.5 Basic Mission Control 33
3.5 Path Planning 34
3.5.1 EGO-Planner 35
3.5.2 Path Planning Integration 35
Chapter 4 Experiments 39
4.1 Sensor Preprocessing 39
4.2 XVINS Performance 41
4.2.1 XFeat Matching Performance 41
4.2.2 Trajectory Accuracy Evaluation on EuRoC 43
4.3 PID Parameters Tuning and Results 44
4.4 Real-World Autonomous UAV Experiments 46
4.4.1 Experimental Setup and Task Definition 46
4.4.2 Real-World Performance Comparison 48
4.5 Path Planning 52
4.6 Limitations in Low-Texture Environments 53
Chapter 5 Conclusions and Future Work 55
5.1 Conclusions 55
5.2 Future Work 55
References 57
Appendix A - IMU Filtering 65
A.1 IMU Filtering Methods 65
A.2 Experimental Results 66 | - |
| dc.language.iso | en | - |
| dc.subject | 視覺里程計 | zh_TW |
| dc.subject | 未知室內環境之自主無人機 | zh_TW |
| dc.subject | 局部路徑規劃 | zh_TW |
| dc.subject | 基於深度學習之影像局部特徵 | zh_TW |
| dc.subject | Autonomous UAV in Unknown Indoor Environments | en |
| dc.subject | Local Path Planning | en |
| dc.subject | Visual Odometry | en |
| dc.subject | Deep learning-based Local Image Features | en |
| dc.title | 應用輕量級視覺里程計之自主無人機於無 GPS 環境: 實機飛行驗證 | zh_TW |
| dc.title | Lightweight Visual Odometry with Deep Features for Autonomous UAVs in GPS-Denied Environments: Real-World Flight Validation | en |
| dc.type | Thesis | - |
| dc.date.schoolyear | 113-2 | - |
| dc.description.degree | 碩士 | - |
| dc.contributor.oralexamcommittee | 林沛群;林之謙 | zh_TW |
| dc.contributor.oralexamcommittee | Pei-Chun Lin;Je-Chian Lin | en |
| dc.subject.keyword | 未知室內環境之自主無人機,視覺里程計,基於深度學習之影像局部特徵,局部路徑規劃, | zh_TW |
| dc.subject.keyword | Autonomous UAV in Unknown Indoor Environments,Visual Odometry,Deep learning-based Local Image Features,Local Path Planning, | en |
| dc.relation.page | 66 | - |
| dc.identifier.doi | 10.6342/NTU202502789 | - |
| dc.rights.note | 未授權 | - |
| dc.date.accepted | 2025-08-02 | - |
| dc.contributor.author-college | 工學院 | - |
| dc.contributor.author-dept | 土木工程學系 | - |
| dc.date.embargo-lift | N/A | - |
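The English abstract above describes a hybrid front-end in which conventional optical flow handles routine feature tracking and XFeat deep features are introduced only when tracking becomes insufficient. The following Python sketch is a rough illustration of that idea under stated assumptions, not the thesis implementation: the threshold `MIN_TRACKED`, the helper `detect_deep_features`, and the use of Shi-Tomasi corners as a stand-in for XFeat are all hypothetical choices made so the example is self-contained and runnable.

```python
# Minimal sketch of a hybrid tracking front-end, assuming:
#  - Lucas-Kanade optical flow carries features frame to frame,
#  - a learned detector (XFeat in the thesis; a Shi-Tomasi stand-in here)
#    replenishes the feature set only when too few tracks survive.
# MIN_TRACKED, MAX_FEATURES, and detect_deep_features are illustrative
# assumptions, not values or APIs taken from the thesis.
import cv2
import numpy as np

MIN_TRACKED = 80    # assumed threshold below which deep features are added
MAX_FEATURES = 200  # assumed cap on features kept per frame


def detect_deep_features(gray: np.ndarray, max_pts: int) -> np.ndarray:
    """Placeholder for a learned detector such as XFeat.

    Shi-Tomasi corners are used here purely so the sketch runs end to end.
    """
    pts = cv2.goodFeaturesToTrack(gray, maxCorners=max_pts,
                                  qualityLevel=0.01, minDistance=10)
    if pts is None:
        return np.empty((0, 2), np.float32)
    return pts.reshape(-1, 2).astype(np.float32)


def track_hybrid(prev_gray: np.ndarray, curr_gray: np.ndarray,
                 prev_pts: np.ndarray) -> np.ndarray:
    """Track features with LK optical flow; top up when tracking degrades."""
    if len(prev_pts) > 0:
        curr_pts, status, _ = cv2.calcOpticalFlowPyrLK(
            prev_gray, curr_gray, prev_pts.reshape(-1, 1, 2), None)
        good = curr_pts.reshape(-1, 2)[status.ravel() == 1]
    else:
        good = np.empty((0, 2), np.float32)

    # Selectively supplement with deep features, mirroring the abstract's
    # description of when XFeat is brought in.
    if len(good) < MIN_TRACKED:
        extra = detect_deep_features(curr_gray, MAX_FEATURES - len(good))
        if len(extra) > 0:
            good = np.vstack([good, extra])
    return good
```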
| Appears in Collections: | 土木工程學系 | |
Files in This Item:
| File | Size | Format |
|---|---|---|
| ntu-113-2.pdf (Restricted Access) | 30.15 MB | Adobe PDF |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.