NTU Theses and Dissertations Repository

Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/93784
Title: 基於卷積注意力機制非監督式學習之影像拼接
Image Stitching Based on Unsupervised Learning with a Convolutional Block Attention Module
Authors: 張容誠
Jung-Cheng Chang
Advisor: 丁肇隆
Chao-Lung Ting
Co-Advisor: 張恆華
Herng-Hua Chang
Keyword: 影像拼接, 單應性矩陣預測, 影像對準, 影像混合, 深度學習
Image stitching, Homography estimation, Image alignment, Image blending, Deep learning
Publication Year: 2024
Degree: Master's
Abstract: 影像拼接(Image stitching)被廣泛運用在無人自駕車上以提升相機於視野盲區的物體偵測與辨識成功率,能否即時運行並保留完整的拼接結果在此任務中相當重要。過去影像拼接透過偵測與配對特徵點以計算出單應性矩陣(Homography),並進行重疊區域影像的疊加混合以取得最終拼接結果,但缺乏效率。以非監督式學習的方式儘管擁有較快的單應性矩陣預測速度,但矩陣準確度仍有待提升。為解決此問題,本研究提出了新的模型架構,在UDHN的基礎上引入卷積注意力機制,並結合多層位移區塊逐步對影像角落位置進行預測。我們總共產生了18000對256×256大小的訓練資料和各4000對的驗證與測試資料。本研究提出的方法在保留矩陣運算高效性的同時,顯著降低了影像角落位移的均方根誤差,相較於CA-Homo和UDHN分別降低了27.44 % 和75.06 %。此外,本研究採用多階段學習策略,讓模型從較簡單的資料訓練到較複雜的資料,以提高預測準確度並提升訓練效率。另外,本研究提出了基於JBF的影像混合方法,實驗結果顯示,該混合方式在峰值信噪比(PSNR)上較一般混合方式提升了13%。最終,通過消融實驗表明,所提出的模型架構與各個損失函數在整體訓練中,皆起到關鍵作用。總體來說,本研究所使用的模型在提升影像拼接效果的同時,也維持了模型既有的預測速度,讓此模型可應用於即時任務上。
Image stitching is widely used in autonomous vehicles to improve the success rate of object detection and recognition in blind spots; a key challenge is running in real time while retaining the complete stitching result. Traditional methods detect and match keypoints in the input images to compute the homography and then blend the overlapping regions, but this pipeline lacks efficiency. Supervised and unsupervised learning methods offer faster homography estimation, but their accuracy still needs improvement. To address this issue, a new model architecture is proposed: it builds on the unsupervised deep homography network (UDHN) by incorporating a convolutional block attention module (CBAM) and multi-layer shift blocks that progressively predict the corner displacements. In this study, 18,000 pairs of 256×256 images were generated for training, with 4,000 pairs each for validation and testing. While retaining the efficiency of the homography computation, the proposed method reduced the root mean square error (RMSE) of the corner shift by 27.44% and 75.06% compared with content-aware unsupervised deep homography estimation (CA-Homo) and UDHN, respectively. Additionally, a multistage learning strategy was adopted, training the model on simpler data first and gradually progressing to more complex data to improve prediction accuracy and training efficiency. Moreover, a JBF-based blending method was introduced, improving the peak signal-to-noise ratio (PSNR) by 13% over standard blending. Finally, ablation experiments validated that each component of the model architecture and each loss term plays a crucial role in the overall training. In summary, the proposed model improves image stitching quality while maintaining the original prediction speed, making it suitable for real-time applications.
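
The abstract reports homography accuracy as the RMSE of the predicted corner shifts and blending quality as PSNR, with the homography recovered from a 4-point (corner-displacement) parameterization. The Python sketch below only illustrates these quantities under stated assumptions; it is not the thesis implementation, and the corner ordering, the 256×256 image size, and the use of OpenCV's getPerspectiveTransform are assumptions made for the example.

import numpy as np
import cv2

def homography_from_corner_shift(shift, width=256, height=256):
    # shift: (4, 2) array of predicted (dx, dy) displacements for the four
    # corners, ordered top-left, top-right, bottom-right, bottom-left
    # (this ordering is an assumption, not taken from the thesis).
    src = np.float32([[0, 0], [width - 1, 0],
                      [width - 1, height - 1], [0, height - 1]])
    dst = src + np.float32(shift)
    return cv2.getPerspectiveTransform(src, dst)  # 3x3 homography

def corner_shift_rmse(pred_shift, gt_shift):
    # One common definition of corner-error RMSE: root mean square of the
    # per-coordinate differences between predicted and ground-truth shifts.
    diff = np.asarray(pred_shift, float) - np.asarray(gt_shift, float)
    return float(np.sqrt(np.mean(diff ** 2)))

def psnr(blended, reference, max_val=255.0):
    # Peak signal-to-noise ratio between a blended result and a reference image.
    mse = np.mean((np.asarray(blended, float) - np.asarray(reference, float)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(max_val ** 2 / mse)

In practice the solved homography would be used to warp one input image onto the other before blending; the exact RMSE and PSNR definitions used in the thesis may differ in detail from this sketch.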
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/93784
DOI: 10.6342/NTU202401256
Fulltext Rights: Authorized (open access restricted to campus)
Embargo Lift Date: 2029-07-31
Appears in Collections: 工程科學及海洋工程學系 (Department of Engineering Science and Ocean Engineering)

Files in This Item:
File: ntu-112-2.pdf (Restricted Access) | Size: 3.65 MB | Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
