Geometry-Aware Representation Learning for Unsupervised Monocular Depth Estimation

Chih-Hsuan Lo; 羅志軒

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/66482

Title:	Geometry-Aware Representation Learning for Unsupervised Monocular Depth Estimation (幾何感知表示學習用於非監督式單眼深度估計)
Author:	Chih-Hsuan Lo (羅志軒)
Advisor:	Liang-Gee Chen (陳良基)
Co-advisor:	Yu-Chiang Wang (王鈺強)
Keywords:	scene understanding, monocular depth estimation, semantic segmentation, domain adaptation, multi-task learning, unsupervised learning, representation learning
Publication year:	2019
Degree:	Master's
Abstract:	Monocular depth estimation plays an important role in scene understanding. Although many supervised and unsupervised learning approaches have been proposed and have substantially advanced monocular depth estimation, their promising quantitative performance does not necessarily reflect satisfactory depth outputs: most methods remain inaccurate at object boundaries and in fine details, even though depth in these regions is comparatively important in real-world applications. In this paper, we propose a novel geometry-aware representation learning approach, which takes object geometry into account with the aid of semantic scene understanding (i.e., semantic segmentation). With a series of unique combinatorial conditional discriminators deployed over visual appearance and geometric representations, improved unsupervised depth estimation can be achieved. Quantitative and qualitative experiments on the public KITTI dataset verify that our model performs on par with recent state-of-the-art approaches and achieves clear improvements on specific objects.
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/66482
DOI:	10.6342/NTU202000321
Full-text license:	Paid license
Appears in collections:	Graduate Institute of Electronics Engineering

Files in this item:

File	Size	Format
ntu-108-1.pdf (not currently authorized for public access)	2.83 MB	Adobe PDF

Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are otherwise indicated.