NTU Theses and Dissertations Repository
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/79250
Full metadata record

DC Field | Value | Language
dc.contributor.advisor | 莊永裕 (Yung-Yu Chuang) |
dc.contributor.author | Yen-Kai Fan | en
dc.contributor.author | 范延愷 | zh_TW
dc.date.accessioned | 2022-11-23T08:56:41Z | -
dc.date.available | 2022-02-21 |
dc.date.available | 2022-11-23T08:56:41Z | -
dc.date.copyright | 2022-02-21 |
dc.date.issued | 2022 |
dc.date.submitted | 2022-01-18 |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/79250 | -
dc.description.abstract | In computer vision, homography estimation is a fundamental image-alignment algorithm. Thanks to deep features, deep-learning-based homography estimation has in recent years achieved better results than earlier methods in many difficult scenarios. However, these deep-learning-based methods also give up some useful properties, such as equivariant features and the ability to detect outliers, and losing them degrades alignment quality, especially for images that contain foreground objects. In this thesis, we propose a deep learning architecture for homography estimation with rotation-equivariant features and outlier detection. The architecture uses rotation-equivariant features to construct multiple 4D cost volumes, one per rotation angle, and incorporates a self-attention mechanism into a recurrent framework that uses correspondence consistency to learn to ignore foreground objects. Experiments show that our architecture estimates homographies accurately and is more robust on scenes with foreground objects. | zh_TW
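The abstract describes the core computation in prose. Below is a minimal sketch of the cost-volume step only: RAFT-style all-pairs correlation computed once per orientation channel of rotation-equivariant features. The tensor shapes, the (A, C, H, W) orientation-channel layout, and the function names are illustrative assumptions, not the thesis implementation.

```python
# A minimal sketch (not the thesis code): one 4D all-pairs cost volume per
# orientation channel of rotation-equivariant features. Shapes and the
# (A, C, H, W) channel layout are illustrative assumptions.
import torch

def all_pairs_correlation(f1: torch.Tensor, f2: torch.Tensor) -> torch.Tensor:
    """Correlate every pixel of f1 with every pixel of f2.

    f1, f2: (C, H, W) feature maps. Returns an (H, W, H, W) volume.
    """
    c, h, w = f1.shape
    v1 = f1.reshape(c, h * w)          # (C, HW)
    v2 = f2.reshape(c, h * w)          # (C, HW)
    corr = v1.t() @ v2 / c ** 0.5      # scaled dot product, (HW, HW)
    return corr.reshape(h, w, h, w)

def per_angle_volumes(feats1: torch.Tensor, feats2: torch.Tensor) -> torch.Tensor:
    """Stack one 4D volume per orientation channel.

    feats1, feats2: (A, C, H, W), where the A orientation channels would
    come from a rotation-equivariant backbone (e.g. an e2cnn-style network).
    Returns: (A, H, W, H, W).
    """
    return torch.stack([all_pairs_correlation(a, b)
                        for a, b in zip(feats1, feats2)])

# Toy usage: 8 orientation channels, 32-dim features on a 16x16 grid.
feats1 = torch.randn(8, 32, 16, 16)
feats2 = torch.randn(8, 32, 16, 16)
volumes = per_angle_volumes(feats1, feats2)
print(volumes.shape)  # torch.Size([8, 16, 16, 16, 16])
```

A recurrent estimator in the spirit of the abstract would then repeatedly look up these volumes around the current homography estimate and update it, with attention weights suppressing correspondences on foreground objects; those steps are omitted here.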
dc.description.provenance | Made available in DSpace on 2022-11-23T08:56:41Z (GMT). No. of bitstreams: 1; U0001-1401202210281600.pdf: 3351595 bytes, checksum: 8e2cb6e8765352a888716d31236daaf5 (MD5); Previous issue date: 2022 | en
dc.description.tableofcontents |
  Verification Letter from the Oral Examination Committee
  Acknowledgements
  Chinese Abstract
  Abstract
  Contents
  List of Figures
  List of Tables
  Denotation
  Chapter 1 Introduction
    1.1 Feature-based approaches
    1.2 Direct approaches
    1.3 Our solution
  Chapter 2 Related works
    2.1 Homography
      2.1.1 Feature-based approaches
      2.1.2 Direct approaches
    2.2 Iterative refinement for image alignment
    2.3 Equivariance
    2.4 Rotation-equivariant convolutional neural networks
  Chapter 3 Method
    3.1 Full rotation-equivariant correlation volume
      3.1.1 Rotation-equivariant backbone
      3.1.2 Feature extraction
      3.1.3 Relationship measurement
    3.2 Lookup function
    3.3 Iterative updates
    3.4 Objective function
  Chapter 4 Experiment
    4.1 Dataset
    4.2 Implementation details
    4.3 Quantitative comparison
      4.3.1 Warped MS-COCO
      4.3.2 Warped MS-COCO-R
      4.3.3 VidSetd
    4.4 Qualitative comparison
      4.4.1 UDIS-D
  Chapter 5 Conclusions
  References
dc.language.iso | zh-TW |
dc.title | 基於遞迴相似性匹配的穩健單應性矩陣估計 | zh_TW
dc.title | Recurrent Robust Correspondence Matching for Homography Estimation | en
dc.date.schoolyear | 110-1 |
dc.description.degree | 碩士 (Master) |
dc.contributor.oralexamcommittee | 葉正聖 (Chih-Ting Lin), 吳賦哲 (Tien-Kan Chung), (Yu-Chieh Cheng) |
dc.subject.keyword | 電腦視覺, 深度學習, 單應性矩陣估計, 自注意力機制, 相似性匹配 | zh_TW
dc.subject.keyword | computer vision, deep learning, homography estimation, self-attention, correspondence matching | en
dc.relation.page | 33 |
dc.identifier.doi | 10.6342/NTU202200060 |
dc.rights.note | Authorization granted (open access worldwide) |
dc.date.accepted | 2022-01-19 |
dc.contributor.author-college | 電機資訊學院 (College of Electrical Engineering and Computer Science) | zh_TW
dc.contributor.author-dept | 資訊網路與多媒體研究所 (Graduate Institute of Networking and Multimedia) | zh_TW
Appears in collections: 資訊網路與多媒體研究所 (Graduate Institute of Networking and Multimedia)

Files in this item:

File | Size | Format
U0001-1401202210281600.pdf | 3.27 MB | Adobe PDF


All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.
