Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/66482
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 陳良基(Liang-Gee Chen) | |
dc.contributor.author | Chih-Hsuan Lo | en |
dc.contributor.author | 羅志軒 | zh_TW |
dc.date.accessioned | 2021-06-17T00:38:20Z | - |
dc.date.available | 2021-02-17 | |
dc.date.copyright | 2020-02-17 | |
dc.date.issued | 2019 | |
dc.date.submitted | 2020-02-06 | |
dc.identifier.citation | [1] H. Fu, M. Gong, C. Wang, K. Batmanghelich, and D. Tao, "Deep ordinal regression network for monocular depth estimation," in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018, pp. 2002-2011.
[2] D. Eigen, C. Puhrsch, and R. Fergus, "Depth map prediction from a single image using a multi-scale deep network," in Advances in Neural Information Processing Systems, 2014, pp. 2366-2374.
[3] F. Liu, C. Shen, G. Lin, and I. Reid, "Learning depth from single monocular images using deep convolutional neural fields," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 10, pp. 2024-2039, 2015.
[4] C. Godard, O. Mac Aodha, and G. J. Brostow, "Unsupervised monocular depth estimation with left-right consistency," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 270-279.
[5] T. Zhou, M. Brown, N. Snavely, and D. G. Lowe, "Unsupervised learning of depth and ego-motion from video," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 1851-1858.
[6] P.-Y. Chen, A. H. Liu, Y.-C. Liu, and Y.-C. F. Wang, "Towards scene understanding: Unsupervised monocular depth estimation with semantic-aware representation," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019, pp. 2624-2632.
[7] A. Atapour-Abarghouei and T. P. Breckon, "Veritatem dies aperit: Temporally consistent depth prediction enabled by a multi-task geometric and semantic scene understanding approach," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019, pp. 3373-3384.
[8] Z. Zhang, Z. Cui, C. Xu, Z. Jie, X. Li, and J. Yang, "Joint task recursive learning for semantic segmentation and depth estimation," in Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 235-251.
[9] Z. Zhang, Z. Cui, C. Xu, Y. Yan, N. Sebe, and J. Yang, "Pattern-affinitive propagation across depth, surface normal and semantic segmentation," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019, pp. 4106-4115.
[10] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, "Generative adversarial nets," in Advances in Neural Information Processing Systems, 2014, pp. 2672-2680.
[11] D. Scharstein and R. Szeliski, "A taxonomy and evaluation of dense two-frame stereo correspondence algorithms," International Journal of Computer Vision, vol. 47, no. 1-3, pp. 7-42, 2002.
[12] C. Wu et al., "VisualSFM: A visual structure from motion system," 2011.
[13] B. Li, C. Shen, Y. Dai, A. Van Den Hengel, and M. He, "Depth and surface normal estimation from monocular images using regression on deep features and hierarchical CRFs," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 1119-1127.
[14] D. Xu, E. Ricci, W. Ouyang, X. Wang, and N. Sebe, "Multi-scale continuous CRFs as sequential deep networks for monocular depth estimation," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 5354-5362.
[15] R. Mahjourian, M. Wicke, and A. Angelova, "Unsupervised learning of depth and ego-motion from monocular video using 3D geometric constraints," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 5667-5675.
[16] Z. Yin and J. Shi, "GeoNet: Unsupervised learning of dense depth, optical flow and camera pose," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 1983-1992.
[17] Z. Yang, P. Wang, Y. Wang, W. Xu, and R. Nevatia, "LEGO: Learning edge with geometry all at once by watching videos," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 225-234.
[18] Y. Kuznietsov, J. Stuckler, and B. Leibe, "Semi-supervised deep learning for monocular depth map prediction," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 6647-6655.
[19] A. CS Kumar, S. M. Bhandarkar, and M. Prasad, "Monocular depth prediction using generative adversarial networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2018, pp. 300-308.
[20] F. Aleotti, F. Tosi, M. Poggi, and S. Mattoccia, "Generative adversarial networks for unsupervised monocular depth prediction," in Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 0-0.
[21] A. Pilzer, D. Xu, M. Puscas, E. Ricci, and N. Sebe, "Unsupervised adversarial depth estimation using cycled generative networks," in 2018 International Conference on 3D Vision (3DV). IEEE, 2018, pp. 587-595.
[22] R. Chen, F. Mahmood, A. Yuille, and N. J. Durr, "Rethinking monocular depth estimation with adversarial training," arXiv preprint arXiv:1808.07528, 2018.
[23] M. Mirza and S. Osindero, "Conditional generative adversarial nets," arXiv preprint arXiv:1411.1784, 2014.
[24] J. Nath Kundu, P. Krishna Uppala, A. Pahuja, and R. Venkatesh Babu, "AdaDepth: Unsupervised content congruent adaptation for depth estimation," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 2656-2665.
[25] A. Mousavian, H. Pirsiavash, and J. Košecká, "Joint semantic segmentation and depth estimation with deep convolutional networks," in 2016 Fourth International Conference on 3D Vision (3DV). IEEE, 2016, pp. 611-619.
[26] A. Geiger, P. Lenz, C. Stiller, and R. Urtasun, "Vision meets robotics: The KITTI dataset," The International Journal of Robotics Research, vol. 32, no. 11, pp. 1231-1237, 2013.
[27] M. Cordts, M. Omran, S. Ramos, T. Rehfeld, M. Enzweiler, R. Benenson, U. Franke, S. Roth, and B. Schiele, "The Cityscapes dataset for semantic urban scene understanding," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 3213-3223.
[28] G. Ros, L. Sellart, J. Materzynska, D. Vazquez, and A. M. Lopez, "The SYNTHIA dataset: A large collection of synthetic images for semantic segmentation of urban scenes," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 3234-3243.
[29] Z. Yang, P. Wang, W. Xu, L. Zhao, and R. Nevatia, "Unsupervised learning of geometry from videos with edge-aware depth-normal consistency," in Thirty-Second AAAI Conference on Artificial Intelligence, 2018.
[30] E. Tzeng, J. Hoffman, K. Saenko, and T. Darrell, "Adversarial discriminative domain adaptation," in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017.
[31] A. Ranjan, V. Jampani, L. Balles, K. Kim, D. Sun, J. Wulff, and M. J. Black, "Competitive collaboration: Joint unsupervised learning of depth, camera motion, optical flow and motion segmentation," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019, pp. 12240-12249. | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/66482 | - |
dc.description.abstract | 對於場景理解來說,單眼深度預測(monocular depth estimation)是一個重要的判斷依據,雖然目前大量的監督式和非監督式機器學習方法被提出,並在單眼深度預測上取得長足的進展,但通常大部分的方法在物體邊界以及細節上無法獲得很好的結果,而這些部份的深度資訊在生活應用上卻是相對重要部份。
在這篇論文當中,我們提出一個全新的「幾何結構表示學習方法」 (geometry-aware representation learning),透過加入語意分割的資訊將物體幾何結構納入單眼深度預測中,搭配上一系列的特殊條件判別器,用於統整物體結構和視覺外觀,最終有效幫助非監督式單眼深度預測,改善之前大部分方法於物體邊界和細節上不準確的問題。 透過在公開資料集上定量和定性分析,證明我們的表示學習方法在非監督 式單眼深度預測上比肩於目前其餘最先進方法的結果,並於特定物體上取得明顯的進步。 | zh_TW |
dc.description.abstract | Monocular depth estimation plays an important role in scene understanding. While a number of supervised and unsupervised learning approaches for monocular depth estimation have been proposed, their promising quantitative performance might not necessarily reflect satisfactory quality of the depth outputs due to inaccurate object boundaries. In this paper, we propose a novel approach of geometry-aware representation learning, which takes object geometry into account with the aid of semantic scene understanding (i.e., semantic segmentation). With a series of unique combinatorial conditional discriminators deployed over visual appearance and geometric representations, improved unsupervised depth estimation can be achieved. Experiments on the KITTI dataset successfully verify that our model performs favorably against recent approaches. | en |
dc.description.provenance | Made available in DSpace on 2021-06-17T00:38:20Z (GMT). No. of bitstreams: 1 ntu-108-R06943020-1.pdf: 2901697 bytes, checksum: 961ddaf20bdd4645cb263631639ca370 (MD5) Previous issue date: 2019 | en |
dc.description.tableofcontents | Abstract vii
1 Introduction 1
2 Background 5
2.1 Depth Estimation 5
2.2 Adversarial Learning 6
2.3 Multi-task Learning for Visual Analysis 7
2.4 Summary 7
3 The Proposed Method 9
3.1 Overview 9
3.2 Joint Exploitation of Depth and Object Semantics for Representation Learning 10
3.3 Geometry-Aware Representation Learning 13
3.4 Learning Across Data Domains 15
3.5 Full Objective 16
4 Experiment 19
4.1 Dataset 19
4.2 Implementation Details 20
4.3 Quantitative Results 21
4.4 Ablation Study 23
4.5 Qualitative Results 25
5 Conclusion 27
Bibliography 28 | |
dc.language.iso | en | |
dc.title | 幾何感知表示學習用於非監督式單眼深度估計 | zh_TW |
dc.title | Geometry-Aware Representation Learning For Unsupervised Monocular Depth Estimation | en |
dc.type | Thesis | |
dc.date.schoolyear | 108-1 | |
dc.description.degree | 碩士 | |
dc.contributor.coadvisor | 王鈺強(Yu-Chiang Wang) | |
dc.contributor.oralexamcommittee | 陳駿丞(Jun-Cheng Chen),陳祝嵩(Chu-Song Chen) | |
dc.subject.keyword | 場景理解,單眼深度預測,語義分割,領域自適應,多任務學習,非監督學習,表示學習 | zh_TW |
dc.subject.keyword | scene understanding, monocular depth estimation, semantic segmentation, domain adaptation, multi-task learning, unsupervised learning, representation learning | en |
dc.relation.page | 32 | |
dc.identifier.doi | 10.6342/NTU202000321 | |
dc.rights.note | 有償授權 | |
dc.date.accepted | 2020-02-07 | |
dc.contributor.author-college | 電機資訊學院 | zh_TW |
dc.contributor.author-dept | 電子工程學研究所 | zh_TW |
Appears in Collections: | 電子工程學研究所 (Graduate Institute of Electronics Engineering)
Files in This Item:
File | Size | Format | |
---|---|---|---|
ntu-108-1.pdf (currently not authorized for public access) | 2.83 MB | Adobe PDF |
All items in the system are protected by copyright, with all rights reserved, unless otherwise indicated by their specific license terms.