NTU Theses and Dissertations Repository / 電機資訊學院 / 電子工程學研究所
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/71498
Full metadata record
DC field: value (language)
dc.contributor.advisor: 陳良基
dc.contributor.author: YI-TING SHEN (en)
dc.contributor.author: 沈怡廷 (zh_TW)
dc.date.accessioned: 2021-06-17T06:01:54Z
dc.date.available: 2019-02-12
dc.date.copyright: 2019-02-12
dc.date.issued: 2019
dc.date.submitted: 2019-01-31
dc.identifier.citation:
[1] A. Geiger, P. Lenz, C. Stiller, and R. Urtasun, "Vision meets robotics: The kitti dataset," The International Journal of Robotics Research, vol. 32, no. 11, pp. 1231–1237, 2013.
[2] N. Mayer, E. Ilg, P. Häusser, P. Fischer, D. Cremers, A. Dosovitskiy, and T. Brox, "A large dataset to train convolutional networks for disparity, optical flow, and scene flow estimation," in IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), 2016, arXiv:1512.02134. [Online]. Available: http://lmb.informatik.uni-freiburg.de/Publications/2016/MIFDB16
[3] A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, and H. Adam, "Mobilenets: Efficient convolutional neural networks for mobile vision applications," arXiv preprint arXiv:1704.04861, 2017.
[4] Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature, vol. 521, no. 7553, p. 436, 2015.
[5] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "Imagenet classification with deep convolutional neural networks," in Advances in neural information processing systems, 2012, pp. 1097–1105.
[6] A. Graves, A.-r. Mohamed, and G. Hinton, "Speech recognition with deep recurrent neural networks," in Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on. IEEE, 2013, pp. 6645–6649.
[7] V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. Riedmiller, A. K. Fidjeland, G. Ostrovski, et al., "Human-level control through deep reinforcement learning," Nature, vol. 518, no. 7540, p. 529, 2015.
[8] G. A. Bekey, Autonomous robots: from biological inspiration to implementation and control. MIT Press, 2005.
[9] R. Siegwart, I. R. Nourbakhsh, D. Scaramuzza, and R. C. Arkin, Introduction to autonomous mobile robots. MIT Press, 2011.
[10] C. Urmson and W. Whittaker, "Self-driving cars and the urban challenge," IEEE Intelligent Systems, vol. 23, no. 2, pp. 66–68, March 2008.
[11] T. Tomic, K. Schmid, P. Lutz, A. Domel, M. Kassecker, E. Mair, I. L. Grixa, F. Ruess, M. Suppa, and D. Burschka, "Toward a fully autonomous uav: Research platform for indoor and outdoor urban search and rescue," IEEE Robotics Automation Magazine, vol. 19, no. 3, pp. 46–56, Sep. 2012.
[12] P. Lottes, R. Khanna, J. Pfeifer, R. Siegwart, and C. Stachniss, "Uav-based crop and weed classification for smart farming," in 2017 IEEE International Conference on Robotics and Automation (ICRA), May 2017, pp. 3024–3031.
[13] M.-R. Hsieh, Y.-L. Lin, and W. H. Hsu, "Drone-based object counting by spatially regularized regional proposal network," in The IEEE International Conference on Computer Vision (ICCV), vol. 1, 2017.
[14] N. Hirose, A. Sadeghian, M. Vázquez, P. Goebel, and S. Savarese, "Gonet: A semi-supervised deep learning approach for traversability estimation," arXiv preprint arXiv:1803.03254, 2018.
[15] A. Lubben, "Self-driving uber killed a pedestrian as human safety driver watched," Mar 2018, accessed: 2018-12-21. [Online]. Available: https://news.vice.com/en_us/article/kzxq3y/self-driving-uber-killed-a-pedestrian-as-human-safety-driver-watched
[16] M. Poggi and S. Mattoccia, "A wearable mobility aid for the visually impaired based on embedded 3d vision and deep learning," in 2016 IEEE Symposium on Computers and Communication (ISCC), June 2016, pp. 208–213.
[17] E. Tzeng, J. Hoffman, K. Saenko, and T. Darrell, "Adversarial discriminative domain adaptation," in Computer Vision and Pattern Recognition (CVPR), vol. 1, no. 2, 2017, p. 4.
[18] J. Hoffman, E. Tzeng, T. Park, J.-Y. Zhu, P. Isola, K. Saenko, A. A. Efros, and T. Darrell, "Cycada: Cycle-consistent adversarial domain adaptation," arXiv preprint arXiv:1711.03213, 2017.
[19] L. Heng, B. Choi, Z. Cui, M. Geppert, S. Hu, B. Kuan, P. Liu, R. Nguyen, Y. C. Yeo, A. Geiger, et al., "Project autovision: Localization and 3d scene perception for an autonomous vehicle with a multi-camera system," arXiv preprint arXiv:1809.05477, 2018.
[20] A. Suleiman, Z. Zhang, L. Carlone, S. Karaman, and V. Sze, "Navion: A 2mw fully integrated real-time visual-inertial odometry accelerator for autonomous navigation of nano drones," arXiv preprint arXiv:1809.05780, 2018.
[21] T. Zhou, M. Brown, N. Snavely, and D. G. Lowe, "Unsupervised learning of depth and ego-motion from video," in CVPR, 2017.
[22] C. Mei and P. Rives, "Single view point omnidirectional camera calibration from planar grids," in Proceedings 2007 IEEE International Conference on Robotics and Automation, April 2007, pp. 3945–3950.
[23] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, "Imagenet: A large-scale hierarchical image database," in Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on. IEEE, 2009, pp. 248–255.
[24] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," arXiv preprint arXiv:1409.1556, 2014.
[25] J. Sock, J. Kim, J. Min, and K. Kwak, "Probabilistic traversability map generation using 3d-lidar and camera," in 2016 IEEE International Conference on Robotics and Automation (ICRA), May 2016, pp. 5631–5637.
[26] B. Suger, B. Steder, and W. Burgard, "Traversability analysis for mobile robots in outdoor environments: A semi-supervised learning approach based on 3d-lidar data," in Robotics and Automation (ICRA), 2015 IEEE International Conference on. IEEE, 2015, pp. 3941–3946.
[27] I. Bogoslavskyi, O. Vysotska, J. Serafin, G. Grisetti, and C. Stachniss, "Efficient traversability analysis for mobile robots using the kinect sensor," in 2013 European Conference on Mobile Robots, Sep. 2013, pp. 158–163.
[28] M. Pfeiffer, M. Schaeuble, J. Nieto, R. Siegwart, and C. Cadena, "From perception to decision: A data-driven approach to end-to-end motion planning for autonomous ground robots," in IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2017, pp. 1527–1533.
[29] D. Levi, N. Garnett, E. Fetaya, and I. Herzlyia, "Stixelnet: A deep convolutional network for obstacle detection and road segmentation," in BMVC, 2015, pp. 109.1–109.12.
[30] V. Badrinarayanan, A. Kendall, and R. Cipolla, "Segnet: A deep convolutional encoder-decoder architecture for image segmentation," arXiv preprint arXiv:1511.00561, 2015.
[31] N. Smolyanskiy, A. Kamenev, J. Smith, and S. Birchfield, "Toward low-flying autonomous mav trail navigation using deep neural networks for environmental awareness," arXiv preprint arXiv:1705.02550, 2017.
[32] S. Daftry, J. A. Bagnell, and M. Hebert, "Learning transferable policies for monocular reactive mav control," in International Symposium on Experimental Robotics. Springer, 2016, pp. 3–11.
[33] A. Chilian and H. Hirschmüller, "Stereo camera based navigation of mobile robots on rough terrain," in 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems, Oct 2009, pp. 4571–4576.
[34] P. Gohl, D. Honegger, S. Omari, M. Achtelik, M. Pollefeys, and R. Siegwart, "Omnidirectional visual obstacle detection using embedded fpga," in Intelligent Robots and Systems (IROS), 2015 IEEE/RSJ International Conference on. IEEE, 2015, pp. 3938–3943.
[35] C. Häne, T. Sattler, and M. Pollefeys, "Obstacle detection for self-driving cars using only monocular cameras and wheel odometry," in 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Sep. 2015, pp. 5101–5108.
[36] A. Tonioni, M. Poggi, S. Mattoccia, and L. Di Stefano, "Unsupervised adaptation for deep stereo," in The IEEE International Conference on Computer Vision (ICCV), vol. 2, no. 7, 2017, p. 8.
[37] J. Pang, W. Sun, C. Yang, J. Ren, R. Xiao, J. Zeng, and L. Lin, "Zoom and learn: Generalizing deep stereo matching to novel domains," arXiv preprint arXiv:1803.06641, 2018.
[38] D. Scharstein, H. Hirschmüller, Y. Kitajima, G. Krathwohl, N. Nešić, X. Wang, and P. Westling, "High-resolution stereo datasets with subpixel-accurate ground truth," in German Conference on Pattern Recognition. Springer, 2014, pp. 31–42.
[39] T. Schöps, J. L. Schönberger, S. Galliani, T. Sattler, K. Schindler, M. Pollefeys, and A. Geiger, "A multi-view stereo benchmark with high-resolution images and multi-camera videos," in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017, pp. 2538–2547.
[40] J. Michels, A. Saxena, and A. Y. Ng, "High speed obstacle avoidance using monocular vision and reinforcement learning," in Proceedings of the 22nd international conference on Machine learning. ACM, 2005, pp. 593–600.
[41] P. Chakravarty, K. Kelchtermans, T. Roussel, S. Wellens, T. Tuytelaars, and L. Van Eycken, "Cnn-based single image obstacle avoidance on a quadrotor," in Robotics and Automation (ICRA), 2017 IEEE International Conference on. IEEE, 2017, pp. 6369–6374.
[42] P. Althaus and H. I. Christensen, "Behaviour coordination for navigation in office environments," in IEEE/RSJ International Conference on Intelligent Robots and Systems, vol. 3, Sep. 2002, pp. 2298–2304.
[43] S. Han, H. Mao, and W. J. Dally, "Deep compression: Compressing deep neural networks with pruning, trained quantization and huffman coding," arXiv preprint arXiv:1510.00149, 2015.
[44] M. Mancini, G. Costante, P. Valigi, T. A. Ciarfuglia, J. Delmerico, and D. Scaramuzza, "Toward domain independence for learning-based monocular depth estimation," IEEE Robotics and Automation Letters, vol. 2, no. 3, pp. 1778–1785, 2017.
[45] J. Tobin, R. Fong, A. Ray, J. Schneider, W. Zaremba, and P. Abbeel, "Domain randomization for transferring deep neural networks from simulation to the real world," in Intelligent Robots and Systems (IROS), 2017 IEEE/RSJ International Conference on. IEEE, 2017, pp. 23–30.
[46] B. Chu, V. Madhavan, O. Beijbom, J. Hoffman, and T. Darrell, "Best practices for fine-tuning visual classifiers to new domains," in European Conference on Computer Vision. Springer, 2016, pp. 435–442.
[47] "Cs231n convolutional neural networks for visual recognition - transfer learning," http://cs231n.github.io/transfer-learning/, accessed: 2018-12-19.
[48] A. Shrivastava, T. Pfister, O. Tuzel, J. Susskind, W. Wang, and R. Webb, "Learning from simulated and unsupervised images through adversarial training," in CVPR, vol. 2, no. 4, 2017, p. 5.
[49] A. Saxena, M. Sun, and A. Y. Ng, "Make3d: Learning 3d scene structure from a single still image," IEEE transactions on pattern analysis and machine intelligence, vol. 31, no. 5, pp. 824–840, 2009.
[50] F. Liu, C. Shen, G. Lin, and I. D. Reid, "Learning depth from single monocular images using deep convolutional neural fields," IEEE Trans. Pattern Anal. Mach. Intell., vol. 38, no. 10, pp. 2024–2039, 2016.
[51] D. Eigen, C. Puhrsch, and R. Fergus, "Depth map prediction from a single image using a multi-scale deep network," in Advances in neural information processing systems, 2014, pp. 2366–2374.
[52] H. Fu, M. Gong, C. Wang, K. Batmanghelich, and D. Tao, "Deep ordinal regression network for monocular depth estimation," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 2002–2011.
[53] C. Godard, O. Mac Aodha, and G. J. Brostow, "Unsupervised monocular depth estimation with left-right consistency," in CVPR, vol. 2, no. 6, 2017, p. 7.
[54] T. Zhou, S. Tulsiani, W. Sun, J. Malik, and A. A. Efros, "View synthesis by appearance flow," in European conference on computer vision. Springer, 2016, pp. 286–301.
[55] M. Jaderberg, K. Simonyan, A. Zisserman, et al., "Spatial transformer networks," in Advances in neural information processing systems, 2015, pp. 2017–2025.
[56] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, "Image quality assessment: from error visibility to structural similarity," IEEE transactions on image processing, vol. 13, no. 4, pp. 600–612, 2004.
[57] D. Caruso, J. Engel, and D. Cremers, "Large-scale direct slam for omnidirectional cameras," in 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Sept 2015, pp. 141–148.
[58] L. Heng, B. Li, and M. Pollefeys, "Camodocal: Automatic intrinsic and extrinsic calibration of a rig with multiple generic cameras and odometry," in 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems, Nov 2013, pp. 1793–1800.
[59] V. Nair and G. E. Hinton, "Rectified linear units improve restricted boltzmann machines," in Proceedings of the 27th International Conference on International Conference on Machine Learning, ser. ICML'10. USA: Omnipress, 2010, pp. 807–814. [Online]. Available: http://dl.acm.org/citation.cfm?id=3104322.3104425
[60] M. Lin, Q. Chen, and S. Yan, "Network in network," arXiv preprint arXiv:1312.4400, 2013.
[61] E. W. Weisstein, "Euler angles," 2009.
[62] M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin, S. Ghemawat, G. Irving, M. Isard, et al., "Tensorflow: a system for large-scale machine learning," in OSDI, vol. 16, 2016, pp. 265–283.
[63] D. P. Kingma and J. Ba, "Adam: A method for stochastic optimization," arXiv preprint arXiv:1412.6980, 2014.
[64] M. Stone, "Cross-validatory choice and assessment of statistical predictions," Journal of the Royal Statistical Society. Series B (Methodological), pp. 111–147, 1974.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/71498
dc.description.abstract (zh_TW): 自主移動機器人與基於視覺之智能輔助科技在未來將會無處不在於所有人的生活中。其中,為了確保使用上的安全,可通行性分析是這些應用中不可或缺的一環。除了準確度以外,是否能夠適應於各種不同的環境與平台是另一個在設計可通行性分析方法時需要被考慮的重要部分。
因此,針對可通行性估計,本論文提出一個兩階段式之卷積神經網路。在第一階段,不同環境之彩色影像將透過特定領域之深度估計網路轉換成各自的相對深度圖,進而降低原先不同環境之彩色影像彼此之間的領域差異。為了避免需要在不同環境蒐集深度影像以供監督式學習,本論文延伸自我監督學習的方法於魚眼影像,並藉此訓練這些特定領域之深度估計網路。在第二階段,不同環境之相對深度圖,其可通行的機率將透過一個通用可通行性估計網路判斷。基於深度影像擁有較小領域差異之假設,此通用可通行性估計網路僅使用來源領域之資料做監督式學習。
透過真實蒐集之資料顯示,本論文所提出之方法擁有在不同環境間較高的轉移性,並同時擁有具競爭力之準確度。此外,在此特定領域之前提下,本論文所使用之神經網路擁有合理的模型複雜度並僅需要單一一個魚眼相機,有很大的潛力被應用於資源限制較大之平台。
dc.description.abstract (en): Autonomous mobile robots and vision-based intelligent assistive technologies are going to be indispensable in everybody's lives. Among these applications, traversability estimation (TE) is one of the most significant components for guaranteeing people's safety. Besides accuracy, transferability across different environments and platforms is another important design consideration for TE.
In this thesis, a two-stage convolutional neural network is proposed for TE. Input color images are first translated to relative depth maps by domain-specific depth estimation networks. These networks are used to align different domains and are trained with an extended version of the self-supervised learning method dedicated to fisheye images. After that, a universal classification network trained in the source domain is used to estimate traversability from the relative depth maps of all domains.
Experiments on real-world data show that the proposed method achieves competitive accuracy with higher transferability across different environments. In addition, it has a reasonable model complexity under the domain-specific premise and requires only a single fisheye camera, which also makes it suitable for resource-constrained platforms.
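The abstract describes a two-stage pipeline: a per-domain depth estimation network first maps a fisheye color image to a relative depth map, and a single traversability estimation network, trained with supervision only in the source domain, is then reused on the depth maps of every domain. The minimal Python sketch below only illustrates that data flow; the function names, shapes, and placeholder computations are hypothetical and are not taken from the thesis.

    import numpy as np

    def domain_specific_depth(image, domain):
        """Stage 1 stand-in: a domain-specific depth estimation network.
        A real model would be a CNN trained with self-supervised view synthesis
        on fisheye video from `domain`; here we only return a dummy relative
        depth map with the same spatial size as the input image."""
        return np.full(image.shape[:2], 0.7, dtype=np.float32)

    def universal_traversability(depth):
        """Stage 2 stand-in: the universal traversability estimation network.
        A real model would be a classifier trained only on source-domain depth
        maps; here we simply threshold relative depth as a placeholder."""
        return (depth > 0.5).astype(np.float32)

    def estimate_traversability(image, domain):
        depth = domain_specific_depth(image, domain)  # RGB -> relative depth
        return universal_traversability(depth)        # depth -> traversability map

    # Example usage with a dummy 480x640 fisheye frame from a hypothetical domain.
    frame = np.zeros((480, 640, 3), dtype=np.uint8)
    traversable = estimate_traversability(frame, domain="corridor")

The split reflected here follows the abstract's assumption that relative depth maps exhibit a smaller domain gap than color images, so only the first stage needs to be domain specific while the second stage can be shared.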
dc.description.provenance (en): Made available in DSpace on 2021-06-17T06:01:54Z (GMT). No. of bitstreams: 1
ntu-108-R05943001-1.pdf: 47584565 bytes, checksum: 06bb89f73271cf6320a600b538946db1 (MD5)
Previous issue date: 2019
dc.description.tableofcontents:
Abstract xi
1 Introduction 1
1.1 Motivation 1
1.2 Design Considerations for Traversability Estimation 2
1.3 Major Contributions 3
1.4 Thesis Organization 5
2 Background 7
2.1 Traversability Estimation 7
2.1.1 Overview 7
2.1.2 Monocular Depth Estimation for Traversability Estimation 8
2.2 Domain Adaptation 10
2.2.1 Overview 10
2.2.2 Relationship between Unsupervised Domain Adaptation and the Proposed Approach 11
2.3 Monocular Depth Estimation 12
2.3.1 Overview 12
2.3.2 Self-Supervised Learning of Monocular Depth 13
2.4 Summary 14
3 Self-Supervised Learning of Domain-Specific Depth From Fisheye Video 15
3.1 Preliminaries and Notations 17
3.2 View Synthesis with Unified Camera Model 17
3.3 Network Architecture 19
3.3.1 Depth Estimation 19
3.3.2 Camera Motion Estimation 22
3.4 Loss Definition with Fisheye Mask 22
3.4.1 Photometric Consistency Loss 23
3.4.2 Smoothness Loss 24
3.4.3 Total Loss 24
3.5 Experiment 24
3.5.1 Dataset 25
3.5.2 Implementation Details 28
3.5.3 Experimental Setup 28
3.5.4 Evaluation Metric 30
3.5.5 Results 30
3.6 Summary 30
4 Transferable Traversability Estimation with a Single Fisheye Camera 35
4.1 Problem Definition 36
4.2 Major Assumptions 36
4.3 Two-Stage Neural Network 37
4.3.1 Domain-Specific Depth Estimation Network 37
4.3.2 Universal Traversability Estimation Network 39
4.4 Experiment 40
4.4.1 Dataset 40
4.4.2 Implementation Details 41
4.4.3 Experimental Setup 43
4.4.4 Evaluation Metric 45
4.4.5 Results 45
4.5 Future Work 47
4.6 Summary 51
5 Conclusion 55
Bibliography 57
dc.language.iso: en
dc.subject: 非監督領域適應 (zh_TW)
dc.subject: 跨環境之轉移性 (zh_TW)
dc.subject: 跨平台之轉移性 (zh_TW)
dc.subject: 單眼深度估計 (zh_TW)
dc.subject: 可通行性估計 (zh_TW)
dc.subject: traversability estimation (en)
dc.subject: monocular depth estimation (en)
dc.subject: unsupervised domain adaptation (en)
dc.subject: transferability across environments and platforms (en)
dc.title: 使用單一魚眼相機與自我監督學習特定領域深度於可遷移之可通行性估計 (zh_TW)
dc.title: Self-Supervised Learning of Domain-Specific Depth for Transferable Traversability Estimation with a Single Fisheye Camera (en)
dc.type: Thesis
dc.date.schoolyear: 107-1
dc.description.degree: 碩士
dc.contributor.oralexamcommittee: 簡韶逸,楊佳玲,徐宏民
dc.subject.keyword: 可通行性估計,單眼深度估計,非監督領域適應,跨環境之轉移性,跨平台之轉移性 (zh_TW)
dc.subject.keyword: traversability estimation, monocular depth estimation, unsupervised domain adaptation, transferability across environments and platforms (en)
dc.relation.page: 65
dc.identifier.doi: 10.6342/NTU201900348
dc.rights.note: 有償授權
dc.date.accepted: 2019-01-31
dc.contributor.author-college: 電機資訊學院 (zh_TW)
dc.contributor.author-dept: 電子工程學研究所 (zh_TW)
Appears in Collections: 電子工程學研究所

Files in This Item:
File: ntu-108-1.pdf (not authorized for public access)
Size: 46.47 MB
Format: Adobe PDF

