Enhancing Unsupervised Monocular Depth Estimation via Fusing Layer-wised Features

Xinyi Lin; 林心怡

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/21490

Title:	Enhancing Unsupervised Monocular Depth Estimation via Fusing Layer-wised Features
Author:	Xinyi Lin (林心怡)
Advisor:	洪一平 (Yi-Ping Hung)
Keywords:	depth estimation, deep learning, unsupervised learning
Year of publication:	2019
Degree:	Master's
Abstract:	Recently, deep learning methods have shown good performance in depth estimation and visual odometry from monocular video sequences by optimizing the photometric consistency between frames. However, it remains difficult to obtain large-scale ground-truth depth maps for supervising a depth-estimation network. Meanwhile, existing solutions for depth estimation typically produce low-resolution results. Inspired by recent deep learning methods for semantic segmentation, we present a simple but effective unsupervised deep network for more accurate depth estimation and camera motion estimation. An atrous spatial pyramid pooling module and an additional refinement layer are incorporated into an encoder-decoder base model. In addition, we introduce a consistency-regularization loss to make the model more robust to illumination changes. Experiments show that our approach produces high-resolution depth maps with sharper object boundaries and achieves better results on the KITTI benchmark.
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/21490
DOI:	10.6342/NTU201902073
Full-text authorization:	Not authorized
Appears in Collections:	Graduate Institute of Networking and Multimedia
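
As a rough illustration of the atrous spatial pyramid pooling idea named in the abstract, the sketch below applies a convolution kernel at several dilation rates and stacks the results per position. It is a minimal 1-D, pure-Python toy and not the thesis's network: the function names `atrous_conv1d` and `aspp1d`, the shared fixed kernel, and the rates are all assumptions made for illustration. A real module would typically use separately learned 2-D kernels per branch and fuse the branches with a further convolution.

```python
from typing import List, Sequence


def atrous_conv1d(signal: Sequence[float], kernel: Sequence[float], rate: int) -> List[float]:
    """1-D atrous (dilated) convolution with zero padding; output length equals input length.

    Kernel taps are spaced `rate` samples apart, so a 3-tap kernel with
    rate r spans 2*r + 1 samples without adding parameters. As in most deep
    learning frameworks, the kernel is not flipped (cross-correlation).
    """
    if rate < 1:
        raise ValueError("rate must be >= 1")
    if len(kernel) % 2 == 0:
        raise ValueError("kernel length must be odd")
    half = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + (k - half) * rate  # position of the dilated tap
            if 0 <= j < n:  # taps outside the signal contribute zero
                acc += w * signal[j]
        out.append(acc)
    return out


def aspp1d(signal: Sequence[float], kernel: Sequence[float],
           rates: Sequence[int] = (1, 2, 4)) -> List[List[float]]:
    """Hypothetical ASPP-style helper: one branch per dilation rate,
    stacked so each position gets one feature per rate."""
    branches = [atrous_conv1d(signal, kernel, r) for r in rates]
    return [list(values) for values in zip(*branches)]


# Usage: each position now carries responses at two receptive-field sizes.
features = aspp1d([1, 2, 3, 4, 5], [1, 1, 1], rates=(1, 2))
# features[2] == [9.0, 9.0]
```

Larger rates widen the context seen at each position while the output keeps the input's resolution, which is the property that makes this kind of module attractive for dense prediction tasks such as depth estimation.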

Files in this item:

File	Size	Format
ntu-108-1.pdf (not currently authorized for public access)	1.4 MB	Adobe PDF


Unless their copyright terms are specifically indicated, all items in the system are protected by copyright, with all rights reserved.
