NTU Theses and Dissertations Repository
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/2313

Full metadata record
dc.contributor.advisor: 傅立成
dc.contributor.author: Kuo-Hsin Tu (en)
dc.contributor.author: 塗國星 (zh_TW)
dc.date.accessioned: 2021-05-13T06:39:07Z
dc.date.available: 2020-08-24
dc.date.available: 2021-05-13T06:39:07Z
dc.date.copyright: 2017-08-24
dc.date.issued: 2017
dc.date.submitted: 2017-08-16
dc.identifier.citation: [1] Global Status Report on Road Safety 2015. Available: http://www.who.int/violence_injury_prevention/road_safety_status/2015/en/
[2] J. Janai, F. Güney, A. Behl, and A. Geiger, 'Computer Vision for Autonomous Vehicles: Problems, Datasets and State-of-the-Art,' arXiv preprint arXiv:1704.05519, 2017.
[3] V. Sze, Y.-H. Chen, T.-J. Yang, and J. Emer, 'Efficient Processing of Deep Neural Networks: A Tutorial and Survey,' arXiv preprint arXiv:1703.09039, 2017.
[4] S. Ren, K. He, R. Girshick, and J. Sun, 'Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks,' in Advances in Neural Information Processing Systems, pp. 91-99, 2015.
[5] J. Long, E. Shelhamer, and T. Darrell, 'Fully Convolutional Networks for Semantic Segmentation,' in IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431-3440, 2015.
[6] A. Krizhevsky, I. Sutskever, and G. E. Hinton, 'ImageNet Classification with Deep Convolutional Neural Networks,' in Advances in Neural Information Processing Systems, pp. 1097-1105, 2012.
[7] C. Finn, X. Y. Tan, Y. Duan, T. Darrell, S. Levine, and P. Abbeel, 'Deep Spatial Autoencoders for Visuomotor Learning,' in IEEE International Conference on Robotics and Automation, pp. 512-519, 2016.
[8] S. Levine, C. Finn, T. Darrell, and P. Abbeel, 'End-to-End Training of Deep Visuomotor Policies,' Journal of Machine Learning Research, vol. 17, no. 39, pp. 1-40, 2016.
[9] M. Bojarski, D. Del Testa, D. Dworakowski, B. Firner, B. Flepp, P. Goyal, L. D. Jackel, M. Monfort, U. Muller, and J. Zhang, 'End to End Learning for Self-Driving Cars,' arXiv preprint arXiv:1604.07316, 2016.
[10] A. H. van der Heijden, 'Two Stages in Visual Information Processing and Visual Perception?,' Visual Cognition, vol. 3, no. 4, pp. 325-362, 1996.
[11] T. S. Lee and A. L. Yuille, 'Efficient Coding of Visual Scenes by Grouping and Segmentation,' in Bayesian Brain: Probabilistic Approaches to Neural Coding, pp. 141-185, 2006.
[12] B. Huval, T. Wang, S. Tandon, J. Kiske, W. Song, J. Pazhayampallil, M. Andriluka, P. Rajpurkar, T. Migimatsu, and R. Cheng-Yue, 'An Empirical Evaluation of Deep Learning on Highway Driving,' arXiv preprint arXiv:1504.01716, 2015.
[13] D. A. Pomerleau, 'ALVINN: An Autonomous Land Vehicle in a Neural Network,' in Advances in Neural Information Processing Systems, pp. 305-313, 1989.
[14] C. Chen, A. Seff, A. Kornhauser, and J. Xiao, 'DeepDriving: Learning Affordance for Direct Perception in Autonomous Driving,' in IEEE International Conference on Computer Vision, pp. 2722-2730, 2015.
[15] The Open Racing Car Simulator Website. Available: http://torcs.sourceforge.net/
[16] S. Yang, S. Konam, C. Ma, S. Rosenthal, M. Veloso, and S. Scherer, 'Obstacle Avoidance through Deep Networks Based Intermediate Perception,' arXiv preprint arXiv:1704.08759, 2017.
[17] U. Muller, J. Ben, E. Cosatto, B. Flepp, and Y. L. Cun, 'Off-Road Obstacle Avoidance through End-to-End Learning,' in Advances in Neural Information Processing Systems, pp. 739-746, 2006.
[18] A. Giusti, J. Guzzi, D. C. Cireşan, F.-L. He, J. P. Rodríguez, F. Fontana, M. Faessler, C. Forster, J. Schmidhuber, and G. Di Caro, 'A Machine Learning Approach to Visual Perception of Forest Trails for Mobile Robots,' IEEE Robotics and Automation Letters, vol. 1, no. 2, pp. 661-667, 2016.
[19] C. Chen, 'Extracting Cognition out of Images for the Purpose of Autonomous Driving,' Ph.D., Princeton University, 2016.
[20] L. G. Appelbaum and A. M. Norcia, 'Attentive and Pre-Attentive Aspects of Figural Processing,' Journal of Vision, vol. 9, no. 11, pp. 18-18, 2009.
[21] S. Chernova and M. Veloso, 'Interactive Policy Learning through Confidence-Based Autonomy,' Journal of Artificial Intelligence Research, vol. 34, no. 1, p. 1, 2009.
[22] S. Ross and D. Bagnell, 'Efficient Reductions for Imitation Learning,' in International Conference on Artificial Intelligence and Statistics, pp. 661-668, 2010.
[23] D. Silver, J. Bagnell, and A. Stentz, 'High Performance Outdoor Navigation from Overhead Data Using Imitation Learning,' Robotics: Science and Systems IV, Zurich, Switzerland, 2008.
[24] S. Ross, G. J. Gordon, and D. Bagnell, 'A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning,' in International Conference on Artificial Intelligence and Statistics, pp. 627-635, 2011.
[25] J. Zhang and K. Cho, 'Query-Efficient Imitation Learning for End-to-End Simulated Driving,' in AAAI Conference on Artificial Intelligence, pp. 2891-2897, 2017.
[26] S. J. Pan and Q. Yang, 'A Survey on Transfer Learning,' IEEE Transactions on Knowledge and Data Engineering, vol. 22, no. 10, pp. 1345-1359, 2010.
[27] Y. Bengio, 'Deep Learning of Representations for Unsupervised and Transfer Learning,' in ICML Workshop on Unsupervised and Transfer Learning, pp. 17-36, 2012.
[28] S. Ioffe and C. Szegedy, 'Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift,' in International Conference on Machine Learning, pp. 448-456, 2015.
[29] K. Simonyan and A. Zisserman, 'Very Deep Convolutional Networks for Large-Scale Image Recognition,' arXiv preprint arXiv:1409.1556, 2014.
[30] T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick, 'Microsoft COCO: Common Objects in Context,' in European Conference on Computer Vision, pp. 740-755, 2014.
[31] H. Noh, S. Hong, and B. Han, 'Learning Deconvolution Network for Semantic Segmentation,' in IEEE International Conference on Computer Vision, pp. 1520-1528, 2015.
[32] P. O. Pinheiro, 'Large-Scale Image Segmentation with Convolutional Networks,' Ph.D., École Polytechnique Fédérale de Lausanne, 2017.
[33] L. C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille, 'DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs,' IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. PP, no. 99, pp. 1-1, 2017.
[34] D. Eigen and R. Fergus, 'Predicting Depth, Surface Normals and Semantic Labels with a Common Multi-Scale Convolutional Architecture,' in IEEE International Conference on Computer Vision, pp. 2650-2658, 2015.
[35] G. Lin, C. Shen, A. van den Hengel, and I. Reid, 'Efficient Piecewise Training of Deep Structured Models for Semantic Segmentation,' in IEEE Conference on Computer Vision and Pattern Recognition, pp. 3194-3203, 2016.
[36] V. Badrinarayanan, A. Kendall, and R. Cipolla, 'SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation,' arXiv preprint arXiv:1511.00561, 2015.
[37] F. J. Huang, Y.-L. Boureau, and Y. LeCun, 'Unsupervised Learning of Invariant Feature Hierarchies with Applications to Object Recognition,' in IEEE Conference on Computer Vision and Pattern Recognition, pp. 1-8, 2007.
[38] Udacity Self-Driving Car Challenge 2 Dataset. Available: https://github.com/udacity/self-driving-car/tree/master/challenges/challenge-2
[39] M. Cordts, M. Omran, S. Ramos, T. Rehfeld, M. Enzweiler, R. Benenson, U. Franke, S. Roth, and B. Schiele, 'The Cityscapes Dataset for Semantic Urban Scene Understanding,' in IEEE Conference on Computer Vision and Pattern Recognition, pp. 3213-3223, 2016.
[40] D. P. Kingma and J. Ba, 'Adam: A Method for Stochastic Optimization,' in International Conference on Learning Representations, 2014.
[41] L. Bottou, 'Stochastic Gradient Descent Tricks,' in Neural Networks: Tricks of the Trade: Springer, pp. 421-436, 2012.
[42] S. Ji, W. Xu, M. Yang, and K. Yu, '3D Convolutional Neural Networks for Human Action Recognition,' IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 1, pp. 221-231, 2013.
[43] Teaching a Machine to Steer a Car. Available: https://medium.com/udacity/teaching-a-machine-to-steer-a-car-d73217f2492c
[44] Model of Team Rwightman in Udacity Self-Driving Car Challenge 2. Available: https://github.com/udacity/self-driving-car/blob/master/steering-models/evaluation/rwightman.py
[45] K. He, X. Zhang, S. Ren, and J. Sun, 'Deep Residual Learning for Image Recognition,' in IEEE Conference on Computer Vision and Pattern Recognition, pp. 770-778, 2016.
[46] Model of Team Epoch in Udacity Self-Driving Car Challenge 2. Available: https://github.com/udacity/self-driving-car/tree/master/steering-models/community-models/cg23
[47] Udacity Self-Driving Car Challenge 2 Leaderboard. Available: https://github.com/udacity/self-driving-car/tree/master/challenges/challenge-2#final-leaderboard
[48] T. Mikolov, M. Karafiat, and L. Burget, 'Recurrent Neural Network Based Language Model,' in Eleventh Annual Conference of the International Speech Communication Association, 2010.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/2313
dc.description.abstract (zh_TW): 在視覺式自動駕駛系統中,感知與控制是兩個重要且待解決的議題。此外,由於深度卷積神經網路在解決感知與控制問題上有非常好的能力,使得深度卷積神經網路成為視覺式自動駕駛系統的解決方案之一。在本論文中,我們證明語義分割可以用來提升視覺式自動駕駛系統的效能。論文中提出了一個使用語義感知並基於端對端深度卷積神經網路的方法來解決自動駕駛中的視覺式控制問題。所提出的方法具有兩個階段並透過影像輸入來預測汽車轉向操控。在第一個階段中,使用一個深度卷積神經網路從輸入影像產生語義分割的結果,在第二個階段中則使用另一個深度卷積神經網路從語義分割資訊來預測出汽車轉向操控。在實驗中,我們使用一個公開的汽車駕駛資料集來評估所提出的方法,實驗結果顯示該方法能達到比一般端對端的深度卷積神經網路方法更好的結果。
dc.description.abstract (en): In vision-based autonomous driving systems, perception and control are two critical problems to be solved. The effectiveness of deep convolutional neural networks (CNNs) in solving visual perception and control tasks has made CNNs a desirable solution for autonomous driving. In this thesis, we show that semantic segmentation can be applied to enhance the performance of a vision-based autonomous driving system. We propose an end-to-end CNN architecture with semantic perception to solve the vision-based control problem in autonomous driving. The proposed approach is a two-stage CNN architecture that takes a monocular image as input and outputs a steering angle. In the first stage, a CNN module generates a semantic segmentation from the input image. In the second stage, another CNN module takes advantage of this semantic perception to predict steering angles. In the experiments, a publicly available dataset of human driving data is used to evaluate the proposed method. Experimental results demonstrate that the proposed method improves on the typical end-to-end CNN approach.
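The two-stage pipeline described in the abstract (image → semantic segmentation → steering angle) can be sketched schematically. This is an illustrative sketch only, with random weights and toy dimensions: the stand-in "networks", sizes, and names below are assumptions, not the thesis's actual SegNet-style perception network or control network.

```python
import numpy as np

rng = np.random.default_rng(0)

H, W, C, K = 64, 96, 3, 12  # image height/width, input channels, semantic classes

def perception_net(img, w):
    """Stage 1 (schematic): per-pixel classification standing in for the
    segmentation CNN; a 1x1 'convolution' is just a channel-wise matmul."""
    logits = img @ w              # (H, W, K) class scores per pixel
    return logits.argmax(axis=-1)  # (H, W) semantic label map

def control_net(seg, w, b):
    """Stage 2 (schematic): map pooled one-hot segmentation features
    to a single steering-angle scalar."""
    onehot = np.eye(K)[seg]        # (H, W, K) one-hot encoding of labels
    pooled = onehot.mean(axis=(0, 1))  # (K,) class-frequency features
    return float(pooled @ w + b)

# Random weights stand in for trained parameters.
w1 = rng.normal(size=(C, K))
w2 = rng.normal(size=K)
img = rng.random((H, W, C))

seg = perception_net(img, w1)      # intermediate semantic representation
angle = control_net(seg, w2, 0.0)  # predicted steering angle
```

The key design point the sketch mirrors is that the control stage sees only the semantic representation, not the raw pixels, which is what distinguishes this approach from a single end-to-end network.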
dc.description.provenance: Made available in DSpace on 2021-05-13T06:39:07Z (GMT). No. of bitstreams: 1. ntu-106-P04922004-1.pdf: 2959469 bytes, checksum: 1b0b0ba368fa10312e7ccc848c4b14d9 (MD5). Previous issue date: 2017 (en)
dc.description.tableofcontents: 誌謝 (Acknowledgements) i
中文摘要 (Chinese Abstract) ii
Abstract iv
Contents v
List of Figures vii
List of Tables ix
Chapter 1 Introduction 1
1.1 Motivation 1
1.2 Related Work 4
1.3 Contributions 7
1.4 Thesis Organization 7
Chapter 2 Preliminaries 9
2.1 Vision-based Autonomous Driving Systems 9
2.2 Imitation Learning 10
2.3 Transfer Learning 11
2.4 Convolutional Neural Networks (CNNs) 13
2.5 Semantic Segmentation 19
Chapter 3 Methodology 23
3.1 System Overview 23
3.2 Semantic Segmentation Generation 24
3.2.1 Encoder Network 25
3.2.2 Decoder Network 27
3.3 Car Steering Angle Prediction 30
Chapter 4 Experiments 34
4.1 Environments 34
4.2 Dataset 35
4.2.1 Cityscapes Dataset 36
4.2.2 Udacity Self-Driving Car Challenge 2 Dataset 35
4.3 Evaluation Metrics 38
4.4 Implementation Details 38
4.4.1 Semantic Segmentation Annotation for Udacity Dataset 38
4.4.2 Baseline Model 39
4.4.3 Perception Network 39
4.4.4 Control Network 40
4.5 Results 40
4.5.1 Overall Performance 40
4.5.2 Analysis of Error Cases 43
4.5.3 Effects of Different Perception Network Models 54
Chapter 5 Conclusion 56
References 58
dc.language.iso: en
dc.subject: 車輛轉向 (zh_TW)
dc.subject: 深度學習 (zh_TW)
dc.subject: 卷積神經網路 (zh_TW)
dc.subject: 語義分割 (zh_TW)
dc.subject: 自動駕駛 (zh_TW)
dc.subject: Deep learning (en)
dc.subject: Vehicle steering (en)
dc.subject: Autonomous driving (en)
dc.subject: Semantic segmentation (en)
dc.subject: Convolutional neural networks (en)
dc.title: 基於深度學習語義分割之城市道路汽車轉向操控 (zh_TW)
dc.title: A Deep Learning Based Semantic Segmentation Approach for Car Steering on Urban Roads (en)
dc.type: Thesis
dc.date.schoolyear: 105-2
dc.description.degree: 碩士 (Master)
dc.contributor.coadvisor: 蕭培墉
dc.contributor.oralexamcommittee: 傅楸善, 黃世勳, 方瓊瑤
dc.subject.keyword: 深度學習, 卷積神經網路, 語義分割, 自動駕駛, 車輛轉向 (zh_TW)
dc.subject.keyword: Deep learning, Convolutional neural networks, Semantic segmentation, Autonomous driving, Vehicle steering (en)
dc.relation.page: 62
dc.identifier.doi: 10.6342/NTU201703573
dc.rights.note: 同意授權(全球公開) (authorized, publicly available worldwide)
dc.date.accepted: 2017-08-17
dc.contributor.author-college: 電機資訊學院 (zh_TW)
dc.contributor.author-dept: 資訊工程學研究所 (zh_TW)
Appears in Collections: 資訊工程學系

Files in this item:
File | Size | Format
ntu-106-1.pdf | 2.89 MB | Adobe PDF | View/Open

