NTU Theses and Dissertations Repository
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/80929
Full metadata record (DC field: value [language])
dc.contributor.advisor: 洪一平 (Yi-Ping Hung)
dc.contributor.author: Rong-Rong Zhang [en]
dc.contributor.author: 張蓉蓉 [zh_TW]
dc.date.accessioned: 2022-11-24T03:22:22Z
dc.date.available: 2021-11-08
dc.date.available: 2022-11-24T03:22:22Z
dc.date.copyright: 2021-11-08
dc.date.issued: 2021
dc.date.submitted: 2021-10-03
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/80929
dc.description.abstract: Absolute camera pose regression is a class of single-shot camera localization methods. It encodes the information of a 3D scene in an end-to-end neural network, so it needs less time to estimate a pose than structure-based localization methods. In this thesis, I propose an absolute pose regression method that uses RGB-D images, aiming to fuse color and depth information for more accurate localization. I use a dual-stream network architecture to process the color image and the depth image separately, combined with handcrafted base poses to reduce the network's dependence on the motion trajectories in the training data. To compare with existing absolute pose regression methods, I evaluate the localization performance of this method on indoor and outdoor datasets; the results show improved performance. [translated from zh_TW]
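The abstract describes the architecture only at a high level. Purely as an illustration, the sketch below shows one plausible way to wire an RGB-D dual-stream regressor whose output pose is a learned weighted combination of fixed, handcrafted base poses. The ResNet-34 backbones, the softmax blending, and every name in the snippet are assumptions made for this sketch, not the thesis's actual implementation (see Chapter 3 of the PDF for that).

```python
# A minimal sketch, assuming PyTorch/torchvision; all names here
# (DualStreamAPR, coeff_head, the ResNet-34 backbones) are illustrative
# assumptions, not the thesis's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.models as models


class DualStreamAPR(nn.Module):
    """Absolute pose regression from an RGB image plus a depth map, with the
    output pose expressed as a weighted combination of fixed base poses."""

    def __init__(self, base_t: torch.Tensor, base_q: torch.Tensor):
        # base_t: (B, 3) handcrafted base translations.
        # base_q: (B, 4) handcrafted base rotations as unit quaternions.
        super().__init__()
        self.rgb_stream = models.resnet34(weights=None)
        self.depth_stream = models.resnet34(weights=None)
        # Depth input has a single channel; swap in a 1-channel first conv.
        self.depth_stream.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2,
                                            padding=3, bias=False)
        feat_dim = self.rgb_stream.fc.in_features  # 512 for ResNet-34
        self.rgb_stream.fc = nn.Identity()
        self.depth_stream.fc = nn.Identity()
        # One mixing coefficient per base pose, from the fused features.
        self.coeff_head = nn.Linear(2 * feat_dim, base_t.shape[0])
        self.register_buffer("base_t", base_t)  # fixed, not learned
        self.register_buffer("base_q", base_q)

    def forward(self, rgb: torch.Tensor, depth: torch.Tensor):
        fused = torch.cat([self.rgb_stream(rgb),
                           self.depth_stream(depth)], dim=1)
        w = torch.softmax(self.coeff_head(fused), dim=1)  # (N, B) weights
        t = w @ self.base_t                          # blended translation
        q = F.normalize(w @ self.base_q, dim=1)      # approx. rotation blend
        return t, q


# Hypothetical usage with four randomly generated base poses:
bases_t = torch.randn(4, 3)
bases_q = F.normalize(torch.randn(4, 4), dim=1)
model = DualStreamAPR(bases_t, bases_q)
t, q = model(torch.randn(2, 3, 224, 224), torch.randn(2, 1, 224, 224))
```

Constraining the output to a combination of a few scene-covering base poses is one way to keep predictions from being limited to the training trajectories, which is the role the abstract assigns to the handcrafted base poses.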
dc.description.provenance: Made available in DSpace on 2022-11-24T03:22:22Z (GMT). No. of bitstreams: 1; U0001-1409202121575900.pdf: 16081195 bytes, checksum: 274d80b4cb458cd112aa721bd03a4edb (MD5). Previous issue date: 2021 [en]
dc.description.tableofcontents:
  摘要 (Chinese abstract); Abstract; Contents; List of Figures; List of Tables
  Chapter 1  Introduction
  Chapter 2  Related Works
    2.1  End-to-End Approaches
      2.1.1  Absolute Camera Pose Regression
      2.1.2  Multi-task Regression
    2.2  Step-by-Step Approaches
      2.2.1  Relative Camera Pose Regression with Image Retrieval
      2.2.2  Structure-based Localization with Scene Coordinate Regression
      2.2.3  Structure-based Localization with Image Retrieval
  Chapter 3  Proposed Method
    3.1  A Theory of Absolute Camera Pose Regression
      3.1.1  Theoretical Model
      3.1.2  Rotation Formalisms
      3.1.3  Base Poses
    3.2  Network Architecture
    3.3  Loss Function
    3.4  Training Mechanism
    3.5  Implementation Details
      3.5.1  Depth Completion
      3.5.2  Selection of Handcrafted Base Poses
  Chapter 4  Experiments
    4.1  Datasets
    4.2  Comparison with Prior Methods
    4.3  Ablation Studies
      4.3.1  Effect of Depth Completion
      4.3.2  Comparison of Different Base Poses
      4.3.3  Comparison of Different Network Architectures
  Chapter 5  Conclusions
  Chapter 6  Future Work
  References
dc.language.iso: en
dc.subject: 攝影機位姿估計; 攝影機絕對位姿回歸; 單一影像攝影機定位; 雙流網路; 人工設計的基底位姿 [zh_TW] (Chinese equivalents of the English subjects below)
dc.subject: Camera pose estimation; Absolute camera pose regression; Single-shot camera localization; Dual-stream network; Handcrafted base poses [en]
dc.title: 使用 RGB-D 雙流網路與人工設計的基底位姿之攝影機絕對位姿回歸 [zh_TW]
dc.title: Absolute Camera Pose Regression Using RGB-D Dual-Stream Network and Handcrafted Base Poses [en]
dc.date.schoolyear: 109-2
dc.description.degree: Master (碩士)
dc.contributor.oralexamcommittee: 陳祝嵩 (Chu-Song Chen), 陳冠文 (Kuan-Wen Chen)
dc.subject.keyword: 攝影機絕對位姿回歸; 單一影像攝影機定位; 雙流網路; 人工設計的基底位姿; 攝影機位姿估計 [zh_TW]
dc.subject.keyword: Absolute camera pose regression; Single-shot camera localization; Dual-stream network; Handcrafted base poses; Camera pose estimation [en]
dc.relation.page: 43
dc.identifier.doi: 10.6342/NTU202103180
dc.rights.note: Authorized for campus-only access (同意授權,限校園內公開)
dc.date.accepted: 2021-10-05
dc.contributor.author-college: College of Electrical Engineering and Computer Science (電機資訊學院) [zh_TW]
dc.contributor.author-dept: Graduate Institute of Computer Science and Information Engineering (資訊工程學研究所) [zh_TW]
Appears in collections: Department of Computer Science and Information Engineering (資訊工程學系)

Files in this item:
  U0001-1409202121575900.pdf — 15.7 MB, Adobe PDF (access restricted to NTU campus IPs; use the library's VPN service from off campus)


Items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.
