Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/58938
Full metadata record
DC field: value [language]
dc.contributor.advisor: 洪一平 (Yi-Ping Hung)
dc.contributor.author: Chun-Hsien Li [en]
dc.contributor.author: 李俊賢 [zh_TW]
dc.date.accessioned: 2021-06-16T08:39:53Z
dc.date.available: 2020-07-17
dc.date.copyright: 2020-07-17
dc.date.issued: 2020
dc.date.submitted: 2020-07-09
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/58938
dc.description.abstract: Image-based localization with deep learning has become a major trend in localization research in recent years. Deep networks run quickly once parallelized on a GPU, which makes real-time operation possible, and their memory use does not grow over time: a fixed-size model learns the content of a scene, and its convolutional layers approximate the geometric computations of traditional localization. In this thesis, we develop an end-to-end localization system that integrates a feature-weighted mechanism with a long short-term memory (LSTM) model and uses a loss function that automatically learns the relative scale weighting between position and rotation. We evaluate the system on outdoor and indoor datasets and compare it with two classic deep learning localization systems. In terms of median error, our system outperforms both on the outdoor and the indoor datasets, and the error trajectory plots also show a clear improvement. Finally, the system runs at more than ninety frames per second on average, which is sufficient for real-time operation. [en]
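The abstract mentions a loss function that learns the scale weighting between position and rotation, but this record does not give its formula. As a rough illustration only, the sketch below assumes the learnable-weight form proposed by Kendall and Cipolla in "Geometric loss functions for camera pose regression with deep learning", where two scalars s_x and s_q balance the two error terms. The function names, the use of plain floats instead of trainable parameters, and the sample values are all assumptions made for this example, not the thesis's implementation.

```python
import math


def position_error(p_pred, p_true):
    """Euclidean distance between predicted and true 3D positions."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p_pred, p_true)))


def rotation_error(q_pred, q_true):
    """L2 distance between the normalized predicted quaternion and the true one."""
    norm = math.sqrt(sum(c * c for c in q_pred))
    q_unit = [c / norm for c in q_pred]
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(q_unit, q_true)))


def learned_weight_loss(pos_err, rot_err, s_x, s_q):
    """Assumed form: L = L_x * exp(-s_x) + s_x + L_q * exp(-s_q) + s_q.

    In training, s_x and s_q would be learnable scalars optimized together
    with the network weights; here they are plain floats for illustration.
    """
    return pos_err * math.exp(-s_x) + s_x + rot_err * math.exp(-s_q) + s_q


# Hypothetical sample pose: the position is off by 3 units, the rotation is exact.
pos_err = position_error((1.0, 2.0, 2.0), (0.0, 0.0, 0.0))
rot_err = rotation_error((2.0, 0.0, 0.0, 0.0), (1.0, 0.0, 0.0, 0.0))
loss = learned_weight_loss(pos_err, rot_err, s_x=0.0, s_q=0.0)
print(loss)
```

With both scalars at zero the loss reduces to the plain sum of the two errors. Raising s_x shrinks the position term's weight while the additive s_x term penalizes doing so without limit, which is what lets the balance be learned rather than hand-tuned.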
dc.description.provenance: Made available in DSpace on 2021-06-16T08:39:53Z (GMT). No. of bitstreams: 1. U0001-0807202021152700.pdf: 3520570 bytes, checksum: fd941aaf8729849fd02130c8d0659c3b (MD5). Previous issue date: 2020 [en]
dc.description.tableofcontents:
Oral Examination Committee Certification
Acknowledgements
Chinese Abstract
ABSTRACT
CONTENTS
LIST OF FIGURES
LIST OF TABLES
Chapter 1 Introduction
Chapter 2 Related Works
2.1 Structure-Based Localization
2.2 Image Retrieval Localization
2.3 Deep Learning Localization
2.3.1 Absolute Camera Pose Regression
2.3.2 Relative Camera Pose Regression
Chapter 3 Proposed Feature-Weighted Model
3.1 Congenital Defects of LSTM
3.2 Network Architecture
3.2.1 Edited PoseLSTM
3.2.2 Feature-Weighted Mechanism
3.3 Loss Function
3.3.1 Shortcoming of the Loss Function Used by PoseNet
3.3.2 Learn Weight Loss Function
3.4 Data Augmentation and Hyper-parameters
3.4.1 Data Augmentation
3.4.2 Hyper-parameters
Chapter 4 Experiments
4.1 Datasets
4.1.1 Outdoor Dataset
4.1.2 Indoor Dataset
4.2 Outdoor Analysis
4.2.1 Median Error
4.2.2 Cumulative Error
4.2.3 Trajectory Visualization
4.2.4 Ablation Study
4.3 Indoor Analysis
4.3.1 Median Error
4.3.2 Cumulative Error
4.3.3 Trajectory Visualization
4.3.4 Ablation Study
4.4 Time Analysis
Chapter 5 Conclusions
Chapter 6 Future Works
APPENDICES
APPENDIX A Cumulative Error
APPENDIX B Trajectory Visualization
REFERENCES
dc.language.iso: en
dc.subject: Image-based localization [en]
dc.subject: Deep learning [en]
dc.subject: Feature-weighted mechanism [en]
dc.subject: Camera pose estimation [en]
dc.title: 使用特徵加權機制調整特徵相關性的影像定位 [zh_TW]
dc.title: Image-based Localization using Feature-Weighted Mechanism for Adjusting Feature Correlation [en]
dc.type: Thesis
dc.date.schoolyear: 108-2
dc.description.degree: Master's
dc.contributor.oralexamcommittee: 莊永裕 (Yung-Yu Chuang), 王鈺強 (Yu-Chiang Wang), 陳祝嵩 (Chu-Song Chen), 陳冠文 (Kuan-Wen Chen)
dc.subject.keyword: Image-based localization, Deep learning, Feature-weighted mechanism, Camera pose estimation [en]
dc.relation.page: 42
dc.identifier.doi: 10.6342/NTU202001396
dc.rights.note: Licensed for a fee
dc.date.accepted: 2020-07-10
dc.contributor.author-college: College of Electrical Engineering and Computer Science
dc.contributor.author-dept: Graduate Institute of Computer Science and Information Engineering
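Appears in collections: Department of Computer Science and Information Engineering

Files in this item:
File: U0001-0807202021152700.pdf (not authorized for public access)
Size: 3.44 MB
Format: Adobe PDF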


Items in this system are protected by copyright, with all rights reserved, unless their copyright terms state otherwise.
