Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/88039
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 貝蘇章 | zh_TW |
dc.contributor.advisor | Soo-Chang Pei | en |
dc.contributor.author | 巫奕璇 | zh_TW |
dc.contributor.author | Yi-Hsuan Wu | en |
dc.date.accessioned | 2023-08-01T16:33:10Z | - |
dc.date.available | 2023-11-09 | - |
dc.date.copyright | 2023-08-01 | - |
dc.date.issued | 2023 | - |
dc.date.submitted | 2023-07-07 | - |
dc.identifier.citation | [1] Xiaozhi Chen, Huimin Ma, Ji Wan, Bo Li, and Tian Xia. Multi-view 3d object detection network for autonomous driving. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 6526–6534, 2017.
[2] Lucas Tabelini, Rodrigo Berriel, Thiago M. Paixao, Claudine Badue, Alberto F. De Souza, and Thiago Oliveira-Santos. Keep your eyes on the lane: Real-time attention-guided lane detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 294–302, June 2021.
[3] Yifan Liu, Chunhua Shen, Changqian Yu, and Jingdong Wang. Efficient semantic video segmentation with per-frame inference. In European Conference on Computer Vision (ECCV), 2020.
[4] Jordi Pont-Tuset, Federico Perazzi, Sergi Caelles, Pablo Arbeláez, Alexander Sorkine-Hornung, and Luc Van Gool. The 2017 DAVIS challenge on video object segmentation. arXiv:1704.00675, 2017.
[5] E. Ilg, N. Mayer, T. Saikia, M. Keuper, A. Dosovitskiy, and T. Brox. FlowNet 2.0: Evolution of optical flow estimation with deep networks. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017.
[6] Wenhan Yang, Robby T. Tan, Jiashi Feng, Shiqi Wang, Bin Cheng, and Jiaying Liu. Recurrent multi-frame deraining: Combining physics guidance and adversarial learning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(11):8569–8586, 2022.
[7] Dongdong Chen, Mingming He, Qingnan Fan, Jing Liao, Liheng Zhang, Dongdong Hou, Lu Yuan, and Gang Hua. Gated context aggregation network for image dehazing and deraining. In IEEE Winter Conference on Applications of Computer Vision (WACV), 2019.
[8] Wending Yan, Robby T. Tan, Wenhan Yang, and Dengxin Dai. Self-aligned video deraining with transmission-depth consistency. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 11961–11971, 2021.
[9] Long Ma, Tengyu Ma, Risheng Liu, Xin Fan, and Zhongxuan Luo. Toward fast, flexible, and robust low-light image enhancement. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 5637–5646, 2022.
[10] Feifan Lv, Feng Lu, Jianhua Wu, and Chongsoon Lim. MBLLEN: Low-light image/video enhancement using CNNs. In British Machine Vision Conference (BMVC), 2018.
[11] Fan Zhang, Yu Li, Shaodi You, and Ying Fu. Learning temporal consistency for low light video enhancement from single images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 4967–4976, June 2021.
[12] Yuxin Wu, Alexander Kirillov, Francisco Massa, Wan-Yen Lo, and Ross Girshick. Detectron2. https://github.com/facebookresearch/detectron2, 2019.
[13] Xiaohang Zhan, Xingang Pan, Ziwei Liu, Dahua Lin, and Chen Change Loy. Self-supervised learning via conditional motion propagation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2019.
[14] Marius Cordts, Mohamed Omran, Sebastian Ramos, Timo Rehfeld, Markus Enzweiler, Rodrigo Benenson, Uwe Franke, Stefan Roth, and Bernt Schiele. The Cityscapes dataset for semantic urban scene understanding. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
[15] S. Caelles, J. Pont-Tuset, F. Perazzi, A. Montes, K.-K. Maninis, and L. Van Gool. The 2019 DAVIS challenge on VOS: Unsupervised multi-object segmentation. arXiv:1905.00737, 2019.
[16] Ping Hu, Fabian Caba, Oliver Wang, Zhe Lin, Stan Sclaroff, and Federico Perazzi. Temporally distributed networks for fast video semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020.
[17] Boyi Li, Wenqi Ren, Dengpan Fu, Dacheng Tao, Dan Feng, Wenjun Zeng, and Zhangyang Wang. Benchmarking single-image dehazing and beyond. IEEE Transactions on Image Processing, 28(1):492–505, 2019.
[18] Chenxu Luo, Xiaodong Yang, and Alan Yuille. Exploring simple 3d multi-object tracking for autonomous driving. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 10488–10497, October 2021.
[19] Zachary Teed and Jia Deng. RAFT: Recurrent all-pairs field transforms for optical flow. In European Conference on Computer Vision (ECCV), 2020.
[20] Dongwei Ren, Wangmeng Zuo, Qinghua Hu, Pengfei Zhu, and Deyu Meng. Progressive image deraining networks: A better and simpler baseline. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
[21] Tai-Xiang Jiang, Ting-Zhu Huang, Xi-Le Zhao, Liang-Jian Deng, and Yao Wang. FastDeRain: A novel video rain streak removal method using directional gradient priors. IEEE Transactions on Image Processing, 28(4):2089–2102, 2019.
[22] Yu Li, Robby T. Tan, Xiaojie Guo, Jiangbo Lu, and Michael S. Brown. Rain streak removal using layer priors. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 2736–2744, 2016.
[23] Xueyang Fu, Qi Qi, Zheng-Jun Zha, Yurui Zhu, and Xinghao Ding. Rain streak removal via dual graph convolutional network. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, pages 1352–1360, 2021.
[24] Wenhan Yang, Robby T. Tan, Shiqi Wang, and Jiaying Liu. Self-learning video rain streak removal: When cyclic consistency meets temporal correspondence. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2020.
[25] Kui Jiang, Zhongyuan Wang, Peng Yi, Chen Chen, Baojin Huang, Yimin Luo, Jiayi Ma, and Junjun Jiang. Multi-scale progressive fusion network for single image deraining. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2020.
[26] Wenhan Yang, Robby T. Tan, Jiashi Feng, Jiaying Liu, Zongming Guo, and Shuicheng Yan. Deep joint rain detection and removal from a single image. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 1685–1694, 2017.
[27] Haokui Zhang, Chunhua Shen, Ying Li, Yuanzhouhan Cao, Yu Liu, and Youliang Yan. Exploiting temporal consistency for real-time video depth estimation, 2019.
[28] Wei-Sheng Lai, Jia-Bin Huang, Oliver Wang, Eli Shechtman, Ersin Yumer, and Ming-Hsuan Yang. Learning blind video temporal consistency. In Proceedings of the European Conference on Computer Vision (ECCV), pages 170–185, 2018.
[29] Chenyang Lei, Yazhou Xing, and Qifeng Chen. Blind video temporal consistency via deep video prior. Advances in Neural Information Processing Systems, 33:1083–1093, 2020.
[30] Fan Zhang, Yu Li, Shaodi You, and Ying Fu. Learning temporal consistency for low light video enhancement from single images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 4967–4976, 2021.
[31] Zhengyang Wang and Shuiwang Ji. Smoothed dilated convolutions for improved dense prediction. 2018.
[32] A. Ranjan, V. Jampani, L. Balles, K. Kim, D. Sun, J. Wulff, and M. J. Black. Competitive collaboration: Joint unsupervised learning of depth, camera motion, optical flow and motion segmentation. 2018.
[33] Huangying Zhan, Ravi Garg, Chamara S. Weerasekera, Kejie Li, Harsh Agarwal, and Ian Reid. Unsupervised learning of monocular depth estimation and visual odometry with deep feature reconstruction. 2018.
[34] Siyuan Li, Yue Luo, Ye Zhu, Xun Zhao, Yu Li, and Ying Shan. Enforcing temporal consistency in video depth estimation. In Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, 2021.
[35] Zongsheng Yue, Jianwen Xie, Qian Zhao, and Deyu Meng. Semi-supervised video deraining with dynamical rain generator. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021.
[36] Zhou Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli. Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing, 13(4):600–612, 2004.
[37] Anish Mittal, Rajiv Soundararajan, and Alan C. Bovik. Making a "completely blind" image quality analyzer. IEEE Signal Processing Letters, 20(3):209–212, 2013.
[38] X. Guo, Y. Li, and H. Ling. LIME: Low-light image enhancement via illumination map estimation. IEEE Transactions on Image Processing, 26(2):982–993, 2017.
[39] Yonghua Zhang, Jiawan Zhang, and Xiaojie Guo. Kindling the darkness: A practical low-light image enhancer. In Proceedings of the 27th ACM International Conference on Multimedia (MM '19), pages 1632–1640, New York, NY, USA, 2019. ACM.
[40] Chunle Guo, Chongyi Li, Jichang Guo, Chen Change Loy, Junhui Hou, Sam Kwong, and Runmin Cong. Zero-reference deep curve estimation for low-light image enhancement. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 1780–1789, June 2020.
[41] Edwin H. Land. The retinex theory of color vision. Scientific American, 237(6):108–129, 1977.
[42] Mohammad Abdullah-Al-Wadud, Md Hasanul Kabir, M. Ali Akber Dewan, and Oksam Chae. A dynamic histogram equalization for image contrast enhancement. IEEE Transactions on Consumer Electronics, 53(2):593–600, 2007.
[43] Heng-Da Cheng and X. J. Shi. A simple and effective histogram equalization approach to image enhancement. Digital Signal Processing, 14(2):158–170, 2004.
[44] Mading Li, Jiaying Liu, Wenhan Yang, Xiaoyan Sun, and Zongming Guo. Structure-revealing low-light image enhancement via robust retinex model. IEEE Transactions on Image Processing, 27(6):2828–2841, 2018.
[45] Chunle Guo, Chongyi Li, Jichang Guo, Chen Change Loy, Junhui Hou, Sam Kwong, and Runmin Cong. Zero-reference deep curve estimation for low-light image enhancement, 2020.
[46] Yifan Jiang, Xinyu Gong, Ding Liu, Yu Cheng, Chen Fang, Xiaohui Shen, Jianchao Yang, Pan Zhou, and Zhangyang Wang. EnlightenGAN: Deep light enhancement without paired supervision. IEEE Transactions on Image Processing, 30:2340–2349, 2021.
[47] Feifan Lv, Yu Li, and Feng Lu. Attention guided low-light image enhancement with a large scale low-light simulation dataset. International Journal of Computer Vision, 129(7):2175–2193, 2021.
[48] Wei-Sheng Lai, Jia-Bin Huang, Oliver Wang, Eli Shechtman, Ersin Yumer, and Ming-Hsuan Yang. Learning blind video temporal consistency. In European Conference on Computer Vision (ECCV), 2018.
[49] Fitsum Reda, Robert Pottorff, Jon Barker, and Bryan Catanzaro. flownet2-pytorch: PyTorch implementation of FlowNet 2.0: Evolution of optical flow estimation with deep networks. https://github.com/NVIDIA/flownet2-pytorch, 2017. | - |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/88039 | - |
dc.description.abstract | 近年來,電腦視覺的各領域在單張影像處理上表現優異。然而,當這些架構應用在連續的多幀影像時,常因架構只在單張影像上訓練,無法藉由前後幀的資訊來彌補缺失,而造成結果有高度的不穩定性和錯誤率,因此需要針對連續的多幀影像設計有效的架構。另外,因近年來自駕車的蓬勃發展,電腦視覺在自駕車方面的研究也日益增加;多數任務在正常天氣下有良好的表現,但對於夜晚、下雨和起霧這些極端條件仍然是極大的挑戰。
本篇論文主要分析三大任務並解決其中的問題:語義分割、去雨和低光源增強,並針對自駕車的視訊影像進行處理。在每個任務中,我們提出新的架構並加入時序一致性的限制,來加強單張影像和連續多幀影像的結果;同時我們採用訓練和測試時使用不同架構的想法,讓每個任務都能達到即時計算的效果,並應用在自駕車系統上。這篇論文主要有兩個貢獻:第一,我們提出的架構成功地解決了基於單幀影像架構的時序不一致性;第二,我們在各章節的實驗中驗證了新架構的有效性,大幅提升視覺上的表現,並大幅減少各任務計算所需的時間。整體來說,這篇論文設計了穩定且有效的架構,未來不僅對連續多幀影像有更廣泛的應用,更可進一步應用在真實的自駕車系統上。 | zh_TW
dc.description.abstract | Computer vision has made remarkable progress in single-image processing across many fields in recent years. However, when these architectures are applied to videos, the results are often inconsistent and inaccurate because information from adjacent frames is not exploited. It is therefore essential to design dedicated architectures for video processing. With the rapid development of self-driving cars, computer vision research in this area has also increased, and challenging conditions such as rain, fog, and night pose significant obstacles.
This thesis focuses on three crucial tasks for processing video frames in self-driving cars: semantic segmentation, rain removal, and low-light enhancement. We propose a novel architecture for each task and introduce temporal consistency constraints to improve the results on both single images and videos. In addition, we adopt different architectures at training and testing time to achieve real-time processing for each task and apply them to autonomous driving systems. This thesis makes two main contributions: first, the proposed architectures effectively resolve the temporal inconsistency of single-frame architectures; second, we validate the efficacy of the new architectures through a range of experiments in each chapter, significantly improving visual quality and reducing the computation time of each task. Overall, this thesis designs stable and effective architectures with broad applications to video processing that could be deployed in real autonomous driving systems. | en
dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2023-08-01T16:33:10Z No. of bitstreams: 0 | en |
dc.description.provenance | Made available in DSpace on 2023-08-01T16:33:10Z (GMT). No. of bitstreams: 0 | en |
dc.description.tableofcontents | Verification Letter from the Oral Examination Committee i
Acknowledgements iii
摘要 v
Abstract vii
Contents ix
List of Figures xiii
List of Tables xvii
Chapter 1 Introduction 1
1.1 Autonomous Driving 1
1.2 Difference between single image and video 2
1.3 Optical flow estimation 3
1.4 Temporal consistency 4
1.5 Conclusion 5
Chapter 2 Deraining 7
2.1 Introduction 7
2.2 Related Work 9
2.2.1 Gated Context Aggregation Network for Image Dehazing and Deraining (GCANet) 11
2.2.2 Self-Aligned Video Deraining with Transmission-Depth Consistency 12
2.2.3 Recurrent Multi-Frame Deraining: Combining Physics Guidance and Adversarial Learning (MFGAN) 14
2.3 Proposed method for video deraining 15
2.3.1 Network Architecture 16
2.3.2 Loss Function 19
2.3.3 Training Strategy 21
2.4 Experiment Results 21
2.5 Conclusion 24
Chapter 3 Low Light Enhancement 29
3.1 Introduction 29
3.2 Related work 31
3.2.1 Toward Fast, Flexible, and Robust Low-Light Image Enhancement (SCI) 32
3.2.2 MBLLEN: Low-light Image/Video Enhancement Using CNNs 33
3.2.3 Learning Temporal Consistency for Low Light Video Enhancement from Single Images 35
3.3 Proposed method of Low light video enhancement 36
3.3.1 Network Architecture 36
3.3.2 Loss function 39
3.3.3 Training strategy 41
3.4 Experiment Results 41
3.5 Conclusion 43
Chapter 4 Semantic Segmentation 49
4.1 Introduction 49
4.2 Related work 51
4.2.1 Efficient Semantic Video Segmentation with Per-frame Inference (ETC) 51
4.2.2 Temporally Distributed Networks for Fast Video Semantic Segmentation (TDNet) 52
4.3 Proposed method of Semantic Video Segmentation (ET) 54
4.3.1 Network Architecture 55
4.3.2 Loss function 57
4.3.3 Training Strategy 59
4.4 Evaluation metrics 59
4.5 Experiment Result 60
4.6 Conclusion 62
Chapter 5 Conclusion and Future Work 65
References 67 | - |
dc.language.iso | en | - |
dc.title | 自駕車場景視訊的影像去雨、低光源增強和語義分割 | zh_TW |
dc.title | Driving Scene Video Deraining, Low Light Enhancement and Semantic Segmentation for Autonomous Vehicles | en |
dc.type | Thesis | - |
dc.date.schoolyear | 111-2 | - |
dc.description.degree | Master's | - |
dc.contributor.oralexamcommittee | 杭學鳴;丁建均;曾建誠;鍾國亮 | zh_TW |
dc.contributor.oralexamcommittee | Hsueh-Ming Hang;Jian-Jiun Ding;Chien-Cheng Tseng;Kuo-Liang Chung | en |
dc.subject.keyword | 深度學習,光流估計,時序一致性,影片去雨,影片語義分割,影片低光源亮度增強 | zh_TW |
dc.subject.keyword | Deep learning,Optical flow estimation,Temporal consistency,Video deraining,Semantic video segmentation,Low-light video enhancement | en |
dc.relation.page | 73 | - |
dc.identifier.doi | 10.6342/NTU202301340 | - |
dc.rights.note | Not authorized for public access | - |
dc.date.accepted | 2023-07-10 | - |
dc.contributor.author-college | 電機資訊學院 | - |
dc.contributor.author-dept | 電信工程學研究所 | - |
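The abstracts above describe the core technique of this thesis at a high level: a temporal-consistency constraint layered on top of per-frame architectures so that video outputs stay stable across frames. Purely as an illustrative sketch (not the thesis' actual implementation), the snippet below shows one standard way such a constraint is formulated: the previous frame's output is warped to the current frame with optical flow and the masked difference is penalized, in the spirit of the blind video temporal consistency loss of Lai et al. [28]. All names here (warp, temporal_consistency_loss, flow_t_to_prev, alpha) are hypothetical.

```python
# Illustrative sketch only, assuming a PyTorch setup; the thesis' actual loss,
# networks, and flow estimator may differ.
import torch
import torch.nn.functional as F

def warp(x, flow):
    """Backward-warp x (N, C, H, W) with a pixel-displacement flow (N, 2, H, W)."""
    n, _, h, w = x.shape
    ys, xs = torch.meshgrid(torch.arange(h, device=x.device),
                            torch.arange(w, device=x.device), indexing="ij")
    grid = torch.stack((xs, ys), dim=0).float().unsqueeze(0) + flow
    # Normalise the sampling grid to [-1, 1] as required by grid_sample.
    gx = 2.0 * grid[:, 0] / max(w - 1, 1) - 1.0
    gy = 2.0 * grid[:, 1] / max(h - 1, 1) - 1.0
    return F.grid_sample(x, torch.stack((gx, gy), dim=-1), align_corners=True)

def temporal_consistency_loss(out_t, out_prev, frame_t, frame_prev,
                              flow_t_to_prev, alpha=50.0):
    """L1 distance between the current output and the flow-warped previous
    output, weighted by a visibility mask computed from the input frames."""
    warped_out_prev = warp(out_prev, flow_t_to_prev)
    warped_frame_prev = warp(frame_prev, flow_t_to_prev)
    # Pixels where the inputs align well under the flow are likely visible in
    # both frames; occluded or badly warped pixels receive a small weight.
    mask = torch.exp(-alpha * (frame_t - warped_frame_prev).pow(2).sum(1, keepdim=True))
    return (mask * (out_t - warped_out_prev).abs()).mean()
```

In practice the flow would come from an off-the-shelf estimator such as FlowNet 2.0 [5] or RAFT [19], and the visibility mask down-weights occluded regions where warping is unreliable, so the constraint enforces consistency only where correspondence between frames is trustworthy.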
Appears in Collections: | 電信工程學研究所
Files in This Item:
File | Size | Format |
---|---|---|
ntu-111-2.pdf (currently not authorized for public access) | 21.8 MB | Adobe PDF |
All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.