NTU Theses and Dissertations Repository
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/79250
Full metadata record

DC Field | Value | Language
dc.contributor.advisor | 莊永裕 (Yung-Yu Chuang) |
dc.contributor.author | Yen-Kai Fan | en
dc.contributor.author | 范延愷 | zh_TW
dc.date.accessioned | 2022-11-23T08:56:41Z | -
dc.date.available | 2022-02-21 |
dc.date.available | 2022-11-23T08:56:41Z | -
dc.date.copyright | 2022-02-21 |
dc.date.issued | 2022 |
dc.date.submitted | 2022-01-18 |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/79250 | -
dc.description.abstract | In computer vision, homography estimation is a fundamental image-alignment algorithm. Thanks to deep features, deep-learning-based homography estimation has in recent years achieved better results than earlier methods in many difficult scenarios. However, these deep-learning-based methods also give up some useful properties, such as equivariant features and the ability to detect outliers, and losing them degrades alignment quality, especially for images that contain foreground objects. In this thesis, we propose a deep learning architecture for homography estimation with rotation-equivariant features and outlier detection. The architecture uses rotation-equivariant features to construct multiple 4D cost volumes, one per rotation angle, and incorporates a self-attention mechanism into a recurrent framework that uses correspondence consistency to learn to ignore foreground objects. Experiments show that our architecture estimates homographies accurately and is more robust on scenes with foreground objects. | zh_TW
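The abstract describes the core computation in prose. Below is a minimal sketch of the cost-volume step only: RAFT-style all-pairs correlation computed once per orientation channel of rotation-equivariant features. The tensor shapes, the (A, C, H, W) orientation-channel layout, and the function names are illustrative assumptions, not the thesis implementation.

```python
# A minimal sketch (not the thesis code): one 4D all-pairs cost volume per
# orientation channel of rotation-equivariant features. Shapes and the
# (A, C, H, W) channel layout are illustrative assumptions.
import torch

def all_pairs_correlation(f1: torch.Tensor, f2: torch.Tensor) -> torch.Tensor:
    """Correlate every pixel of f1 with every pixel of f2.

    f1, f2: (C, H, W) feature maps. Returns an (H, W, H, W) volume.
    """
    c, h, w = f1.shape
    v1 = f1.reshape(c, h * w)          # (C, HW)
    v2 = f2.reshape(c, h * w)          # (C, HW)
    corr = v1.t() @ v2 / c ** 0.5      # scaled dot product, (HW, HW)
    return corr.reshape(h, w, h, w)

def per_angle_volumes(feats1: torch.Tensor, feats2: torch.Tensor) -> torch.Tensor:
    """Stack one 4D volume per orientation channel.

    feats1, feats2: (A, C, H, W), where the A orientation channels would
    come from a rotation-equivariant backbone (e.g. an e2cnn-style network).
    Returns: (A, H, W, H, W).
    """
    return torch.stack([all_pairs_correlation(a, b)
                        for a, b in zip(feats1, feats2)])

# Toy usage: 8 orientation channels, 32-dim features on a 16x16 grid.
feats1 = torch.randn(8, 32, 16, 16)
feats2 = torch.randn(8, 32, 16, 16)
volumes = per_angle_volumes(feats1, feats2)
print(volumes.shape)  # torch.Size([8, 16, 16, 16, 16])
```

A recurrent estimator in the spirit of the abstract would then repeatedly look up these volumes around the current homography estimate and update it, with attention weights suppressing correspondences on foreground objects; those steps are omitted here.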
dc.description.provenance | Made available in DSpace on 2022-11-23T08:56:41Z (GMT). No. of bitstreams: 1; U0001-1401202210281600.pdf: 3351595 bytes, checksum: 8e2cb6e8765352a888716d31236daaf5 (MD5); Previous issue date: 2022 | en
dc.description.tableofcontents |
  Verification Letter from the Oral Examination Committee
  Acknowledgements
  Chinese Abstract
  Abstract
  Contents
  List of Figures
  List of Tables
  Denotation
  Chapter 1 Introduction
    1.1 Feature-based approaches
    1.2 Direct approaches
    1.3 Our solution
  Chapter 2 Related works
    2.1 Homography
      2.1.1 Feature-based approaches
      2.1.2 Direct approaches
    2.2 Iterative refinement for image alignment
    2.3 Equivariance
    2.4 Rotation-equivariant convolutional neural networks
  Chapter 3 Method
    3.1 Full rotation-equivariant correlation volume
      3.1.1 Rotation-equivariant backbone
      3.1.2 Feature extraction
      3.1.3 Relationship measurement
    3.2 Lookup function
    3.3 Iterative updates
    3.4 Objective function
  Chapter 4 Experiment
    4.1 Dataset
    4.2 Implementation details
    4.3 Quantitative comparison
      4.3.1 Warped MS-COCO
      4.3.2 Warped MS-COCO-R
      4.3.3 VidSetd
    4.4 Qualitative comparison
      4.4.1 UDIS-D
  Chapter 5 Conclusions
  References
dc.language.iso | zh-TW |
dc.title | 基於遞迴相似性匹配的穩健單應性矩陣估計 | zh_TW
dc.title | Recurrent Robust Correspondence Matching for Homography Estimation | en
dc.date.schoolyear | 110-1 |
dc.description.degree | 碩士 (Master) |
dc.contributor.oralexamcommittee | 葉正聖 (Chih-Ting Lin), 吳賦哲 (Tien-Kan Chung), (Yu-Chieh Cheng) |
dc.subject.keyword | 電腦視覺, 深度學習, 單應性矩陣估計, 自注意力機制, 相似性匹配 | zh_TW
dc.subject.keyword | computer vision, deep learning, homography estimation, self-attention, correspondence matching | en
dc.relation.page | 33 |
dc.identifier.doi | 10.6342/NTU202200060 |
dc.rights.note | Authorization granted (open access worldwide) |
dc.date.accepted | 2022-01-19 |
dc.contributor.author-college | 電機資訊學院 (College of Electrical Engineering and Computer Science) | zh_TW
dc.contributor.author-dept | 資訊網路與多媒體研究所 (Graduate Institute of Networking and Multimedia) | zh_TW
Appears in collections: 資訊網路與多媒體研究所 (Graduate Institute of Networking and Multimedia)

Files in this item:

File | Size | Format
U0001-1401202210281600.pdf | 3.27 MB | Adobe PDF


All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.
