Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/93784

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 丁肇隆 | zh_TW |
| dc.contributor.advisor | Chao-Lung Ting | en |
| dc.contributor.author | 張容誠 | zh_TW |
| dc.contributor.author | Jung-Cheng Chang | en |
| dc.date.accessioned | 2024-08-08T16:11:38Z | - |
| dc.date.available | 2024-08-09 | - |
| dc.date.copyright | 2024-08-08 | - |
| dc.date.issued | 2024 | - |
| dc.date.submitted | 2024-07-31 | - |
| dc.identifier.citation | L. Wang, W. Yu and B. Li, "Multi-scenes Image Stitching Based on Autonomous Driving," in: Proceedings of the IEEE 4th Information Technology, Networking, Electronic and Automation Control Conference (ITNEC), 2020.
D. DeTone, T. Malisiewicz and A. Rabinovich, "Deep Image Homography Estimation," in: Proceedings of the RSS Workshop on Limits and Potentials of Deep Learning in Robotics, 2016.
M. A. Fischler and R. C. Bolles, "Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography," Communications of the ACM, vol. 24, no. 6, pp. 381–395, 1981.
S. Baker and I. Matthews, "Lucas-Kanade 20 Years On: A Unifying Framework," International Journal of Computer Vision, vol. 56, no. 3, pp. 221–255, 2004, doi: 10.1023/b:visi.0000011205.11775.fd.
T. Nguyen, S. W. Chen and S. S. Shivakumar, "Unsupervised Deep Homography: A Fast and Robust Homography Estimation Model," IEEE Robotics and Automation Letters, pp. 2346–2353, 2018.
D. G. Lowe, "Object Recognition from Local Scale-Invariant Features," in: Proceedings of the Seventh IEEE International Conference on Computer Vision, pp. 1150–1157, 1999.
E. Rosten and T. Drummond, "Machine Learning for High-Speed Corner Detection," in: Proceedings of the European Conference on Computer Vision, 2006.
C. Harris and M. Stephens, "A Combined Corner and Edge Detector," in: Proceedings of the Alvey Vision Conference, Manchester, UK, 1988.
M. Calonder, V. Lepetit, C. Strecha and P. Fua, "BRIEF: Binary Robust Independent Elementary Features," in: Proceedings of the European Conference on Computer Vision, 2010.
E. Rublee, V. Rabaud, K. Konolige and G. Bradski, "ORB: An Efficient Alternative to SIFT or SURF," in: Proceedings of the IEEE International Conference on Computer Vision, 2011.
J. M. Keller, M. R. Gray and J. A. Givens, "A Fuzzy K-Nearest Neighbor Algorithm," IEEE Transactions on Systems, Man, and Cybernetics, no. 4, pp. 580–585, 1985.
Y. I. Abdel-Aziz and H. M. Karara, "Direct Linear Transformation from Comparator Coordinates into Object Space Coordinates in Close-Range Photogrammetry," in: Proceedings of the Symposium on Close-Range Photogrammetry, University of Illinois at Urbana-Champaign, Champaign, IL, USA, pp. 1–18, 1971.
V. Klema and A. Laub, "The Singular Value Decomposition: Its Computation and Some Applications," IEEE Transactions on Automatic Control, 1980.
G. S. Watson, "Linear Least Squares Regression," Annals of Mathematical Statistics, vol. 38, no. 6, pp. 1679–1699, 1967.
D. DeTone, T. Malisiewicz and A. Rabinovich, "SuperPoint: Self-Supervised Interest Point Detection and Description," in: Proceedings of the CVPR Workshops, pp. 224–236, 2018.
P. Sarlin, D. DeTone, T. Malisiewicz and A. Rabinovich, "SuperGlue: Learning Feature Matching with Graph Neural Networks," in: CVPR, 2020.
L. Zhang, T. Wen and J. Shi, "Deep Image Blending," in: Winter Conference on Applications of Computer Vision (WACV), 2020.
J. Zhang, C. Wang, S. Liu, L. Jia, N. Ye, J. Wang, J. Zhou and J. Sun, "Content-Aware Unsupervised Deep Homography Estimation," in: Proceedings of the ECCV, pp. 653–669, 2020.
D. Koguciuk, E. Arani and B. Zonooz, "Perceptual Loss for Robust Unsupervised Homography Estimation," in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4274–4283, 2021.
L. Nie, C. Lin, K. Liao and Y. Zhao, "Learning Edge-Preserved Image Stitching from Multi-Scale Deep Homography," Neurocomputing, vol. 491, pp. 533–543, 2022.
R. Zeng, S. Denman, S. Sridharan and C. Fookes, "Rethinking Planar Homography Estimation Using Perspective Fields," in: Proceedings of the Asian Conference on Computer Vision, pp. 571–586, Springer, 2018.
M. Hong, Y. Lu, N. Ye, C. Lin, Q. Zhao and S. Liu, "Unsupervised Homography Estimation with Coplanarity-Aware GAN," in: Computer Vision and Pattern Recognition, 2022.
R. A. Haddad and A. N. Akansu, "A Class of Fast Gaussian Binomial Filters for Speech and Image Processing," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 39, pp. 723–727, 1991.
C. Tomasi and R. Manduchi, "Bilateral Filtering for Gray and Color Images," in: Proceedings of the IEEE International Conference on Computer Vision, pp. 839–846, 1998.
S. Woo, J. Park, J. Lee and I. S. Kweon, "CBAM: Convolutional Block Attention Module," in: ECCV, 2018.
A. F. Agarap, "Deep Learning Using Rectified Linear Units (ReLU)," arXiv preprint arXiv:1803.08375, 2018.
A. Krizhevsky, I. Sutskever and G. E. Hinton, "ImageNet Classification with Deep Convolutional Neural Networks," in: NIPS, pp. 1106–1114, 2012.
N. Kanopoulos, N. Vasanthavada and R. L. Baker, "Design of an Image Edge Detection Filter Using the Sobel Operator," IEEE Journal of Solid-State Circuits, 1988.
D. Ulyanov, A. Vedaldi and V. Lempitsky, "Instance Normalization: The Missing Ingredient for Fast Stylization," arXiv preprint arXiv:1607.08022, 2017.
P. J. Burt and E. H. Adelson, "A Multiresolution Spline with Application to Image Mosaics," ACM Transactions on Graphics, 1983.
J. Kopf, M. Cohen, D. Lischinski and M. Uyttendaele, "Joint Bilateral Upsampling," ACM Transactions on Graphics (TOG), p. 96, 2007.
F. Durand and J. Dorsey, "Fast Bilateral Filtering for the Display of High-Dynamic-Range Images," ACM Transactions on Graphics, 2002.
Z. Wang, A. C. Bovik, H. R. Sheikh and E. P. Simoncelli, "Image Quality Assessment: From Error Visibility to Structural Similarity," IEEE Transactions on Image Processing, 2004.
A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, et al., "PyTorch: An Imperative Style, High-Performance Deep Learning Library," in: Advances in Neural Information Processing Systems 32, pp. 8024–8035, 2019.
D. P. Kingma and J. Ba, "Adam: A Method for Stochastic Optimization," arXiv preprint arXiv:1412.6980, 2014.
K. He, X. Zhang, S. Ren and J. Sun, "Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification," in: Proceedings of the IEEE International Conference on Computer Vision, 2015.
M. Shao, T. Tasdizen and S. Joshi, "Domain Shift Immunity of Deep Homography Estimation," in: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2024.
I. Avcibas, B. Sankur and K. Sayood, "Statistical Evaluation of Image Quality Measures," Journal of Electronic Imaging, vol. 11, no. 2, pp. 206–223, 2002. | - |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/93784 | - |
| dc.description.abstract | 影像拼接(Image stitching)被廣泛運用在無人自駕車上以提升相機於視野盲區的物體偵測與辨識成功率,能否即時運行並保留完整的拼接結果在此任務中相當重要。過去影像拼接透過偵測與配對特徵點以計算出單應性矩陣(Homography),並進行重疊區域影像的疊加混合以取得最終拼接結果,但缺乏效率。以非監督式學習的方式儘管擁有較快的單應性矩陣預測速度,但矩陣準確度仍有待提升。為解決此問題,本研究提出了新的模型架構,在UDHN的基礎上引入卷積注意力機制,並結合多層位移區塊逐步對影像角落位置進行預測。我們總共產生了18000對256×256大小的訓練資料和各4000對的驗證與測試資料。本研究提出的方法在保留矩陣運算高效性的同時,顯著降低了影像角落位移的均方根誤差,相較於CA-Homo和UDHN分別降低了27.44 % 和75.06 %。此外,本研究採用多階段學習策略,讓模型從較簡單的資料訓練到較複雜的資料,以提高預測準確度並提升訓練效率。另外,本研究提出了基於JBF的影像混合方法,實驗結果顯示,該混合方式在峰值信噪比(PSNR)上較一般混合方式提升了13%。最終,通過消融實驗表明,所提出的模型架構與各個損失函數在整體訓練中,皆起到關鍵作用。總體來說,本研究所使用的模型在提升影像拼接效果的同時,也維持了模型既有的預測速度,讓此模型可應用於即時任務上。 | zh_TW |
| dc.description.abstract | Image stitching has been widely used in autonomous vehicles to improve the success rate of object detection and recognition in blind spots. One of the challenges is to process in real time while retaining the complete stitching result. Traditional methods detect and match keypoints in the input images to obtain the homography, an approach that lacks efficiency. Supervised and unsupervised learning methods offer faster homography estimation, but their accuracy still needs improvement. To address this issue, a new model architecture is proposed, built upon the unsupervised deep homography network (UDHN) by incorporating a convolutional block attention module (CBAM) and multi-layer corner shifts to progressively predict the final shift. In this study, 18000 pairs of 256×256 images were generated for training, with 4000 pairs each for validation and testing. The root mean square error (RMSE) of the corner shift was reduced by 27.44% and 75.06% compared to content-aware unsupervised deep homography estimation (CA-Homo) and UDHN, respectively. Additionally, a multistage learning approach was proposed to train the model on simpler data first and then gradually progress to more complex data, enhancing prediction accuracy and training efficiency. Moreover, a new blending method was introduced and showed a 13% improvement in PSNR over the normal blending method. Finally, ablation experiments validated that each component of the model architecture and each loss term plays a crucial role in the overall process. To summarize, our approach improves image stitching effectiveness while maintaining the prediction speed, making it suitable for real-time applications. | en |
| dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2024-08-08T16:11:38Z No. of bitstreams: 0 | en |
| dc.description.provenance | Made available in DSpace on 2024-08-08T16:11:38Z (GMT). No. of bitstreams: 0 | en |
| dc.description.tableofcontents | Acknowledgements i
Abstract (Chinese) ii
Abstract iii
List of Figures vii
List of Tables x
List of Abbreviations xi
Chapter 1 Introduction 1
1.1 Motivation 2
1.2 Structure overview 3
Chapter 2 Related work 4
2.1 Traditional methods 4
2.2 Supervised learning methods 6
2.3 Unsupervised learning methods 7
Chapter 3 Methodology 10
3.1 Data preparation 10
3.1.1 Dataset generation 10
3.1.2 Data preprocessing 11
3.1.3 Data normalization 12
3.2 Protocols 13
3.3 Model architecture 14
3.3.1 CBAM 15
3.3.2 Multilayer of corner shifts 17
3.4 Loss 18
3.4.1 Content loss 18
3.4.2 Edge loss 19
3.4.3 Correlation loss 20
3.4.4 Total loss 21
3.5 Image blending 22
3.5.1 Normal blending 22
3.5.2 Multiband blending 23
3.5.3 Joint bilateral filter blending 24
3.6 Total procedure 25
Chapter 4 Results and Discussion 28
4.1 Evaluation metrics 28
4.1.1 Corner RMSE 28
4.1.2 PSNR 28
4.1.3 SSIM 29
4.2 Environment setting 31
4.2.1 Hardware and software 31
4.2.2 Initial setting and hyperparameters 31
4.3 Performance analyses 32
4.3.1 Comparison with traditional methods 33
4.3.2 Comparison with deep learning methods 36
4.3.3 Blending comparison 40
4.4 Ablation study 43
4.4.1 CBAM mechanism 43
4.4.2 Multilayer of corner shifts 45
4.4.3 Loss function comparison 50
4.4.4 Training protocol comparison 52
Chapter 5 Conclusion 56
References 58 | - |
| dc.language.iso | en | - |
| dc.subject | 深度學習 | zh_TW |
| dc.subject | 影像拼接 | zh_TW |
| dc.subject | 影像混合 | zh_TW |
| dc.subject | 影像對準 | zh_TW |
| dc.subject | 單應性矩陣預測 | zh_TW |
| dc.subject | Image stitching | en |
| dc.subject | Homography estimation | en |
| dc.subject | Image alignment | en |
| dc.subject | Image blending | en |
| dc.subject | Deep learning | en |
| dc.title | 基於卷積注意力機制非監督式學習之影像拼接 | zh_TW |
| dc.title | Image Stitching Based on Unsupervised Learning with a Convolutional Block Attention Module | en |
| dc.type | Thesis | - |
| dc.date.schoolyear | 112-2 | - |
| dc.description.degree | Master | - |
| dc.contributor.coadvisor | 張恆華 | zh_TW |
| dc.contributor.coadvisor | Herng-Hua Chang | en |
| dc.contributor.oralexamcommittee | 黃乾綱;陳昭宏;謝傳璋 | zh_TW |
| dc.contributor.oralexamcommittee | Chien-Kang Huang;Jau-Horng Chen;Chuan-Cheung Tse | en |
| dc.subject.keyword | 影像拼接,單應性矩陣預測,影像對準,影像混合,深度學習, | zh_TW |
| dc.subject.keyword | Image stitching,Homography estimation,Image alignment,Image blending,Deep learning, | en |
| dc.relation.page | 61 | - |
| dc.identifier.doi | 10.6342/NTU202401256 | - |
| dc.rights.note | Authorized (campus-only access) | - |
| dc.date.accepted | 2024-08-02 | - |
| dc.contributor.author-college | College of Engineering | - |
| dc.contributor.author-dept | Department of Engineering Science and Ocean Engineering | - |
| dc.date.embargo-lift | 2029-07-31 | - |
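The abstracts above describe predicting four corner displacements, recovering the homography from them (via the direct linear transformation of Abdel-Aziz and Karara, solvable with an SVD), and evaluating results with corner RMSE and PSNR. A minimal NumPy sketch of these pieces follows; the function names are ours, not taken from the thesis, and this is an illustration of the standard techniques rather than the thesis implementation:

```python
import numpy as np

def homography_from_corners(src, dst):
    """Direct Linear Transformation (DLT): each correspondence (x, y) -> (u, v)
    contributes two rows to A; the homography is the null vector of A,
    recovered as the last right-singular vector of the SVD."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]  # fix the arbitrary scale so H[2, 2] == 1

def corner_rmse(pred_shifts, gt_shifts):
    """Root mean square error over the predicted 4-corner displacements."""
    d = np.asarray(pred_shifts, float) - np.asarray(gt_shifts, float)
    return float(np.sqrt(np.mean(d ** 2)))

def psnr(img_a, img_b, peak=255.0):
    """Peak signal-to-noise ratio; higher means the stitched overlap
    matches the reference more closely."""
    mse = np.mean((np.asarray(img_a, float) - np.asarray(img_b, float)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
```

For example, shifting all four corners of a 256×256 image by the same offset yields a pure-translation homography, which is a quick sanity check on the DLT solver.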
| Appears in Collections: | Department of Engineering Science and Ocean Engineering |

Files in This Item:

| File | Size | Format | |
|---|---|---|---|
| ntu-112-2.pdf (restricted; not publicly available) | 3.65 MB | Adobe PDF | View/Open |
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.
