Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/92738
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 丁肇隆 | zh_TW |
dc.contributor.advisor | Chao-Lung Ting | en |
dc.contributor.author | 徐聖淮 | zh_TW |
dc.contributor.author | Sheng-Huai Hsu | en |
dc.date.accessioned | 2024-06-18T16:05:47Z | - |
dc.date.available | 2024-06-19 | - |
dc.date.copyright | 2024-06-18 | - |
dc.date.issued | 2024 | - |
dc.date.submitted | 2024-06-13 | - |
dc.identifier.citation | [1] A. Bochkovskiy, C.-Y. Wang, and H.-Y. M. Liao, “Yolov4: Optimal speed and accuracy of object detection,” arXiv preprint arXiv:2004.10934, 2020.
[2] A. Sabater, L. Montesano, and A. C. Murillo, “Robust and efficient post-processing for video object detection,” In Proceedings of the 2020 IEEE International Conference on Intelligent Robots and Systems, pp. 10536-10542, 2020.
[3] C. Li, L. Li, H. Jiang, K. Weng, Y. Geng, L. Li, Z. Ke, Q. Li, M. Cheng, W. Nie, Y. Li, B. Zhang, Y. Liang, L. Zhou, X. Xu, X. Chu, X. Wei, and X. Wei, “YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications,” arXiv preprint arXiv:2209.02976, 2022.
[4] C. Wang, A. Bochkovskiy, and H.-Y. M. Liao, “YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors,” In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 7464-7475, 2023.
[5] H. Zhang, M. Cisse, Y. N. Dauphin, and D. Lopez-Paz, “mixup: Beyond empirical risk minimization,” arXiv preprint arXiv:1710.09412, 2017.
[6] H. Zhang, Y. Wang, F. Dayoub, and N. Sunderhauf, “Varifocalnet: An iou-aware dense object detector,” In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 8514-8523, 2021.
[7] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, “You only look once: Unified, real-time object detection,” In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 779-788, 2016.
[8] J. Redmon and A. Farhadi, “YOLO9000: better, faster, stronger,” In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 7263-7271, 2017.
[9] J. Redmon and A. Farhadi, “Yolov3: An incremental improvement,” arXiv preprint arXiv:1804.02767, 2018.
[10] K. He, G. Gkioxari, P. Dollár, and R. Girshick, “Mask R-CNN,” In Proceedings of the IEEE international conference on computer vision, pp. 2961-2969, 2017.
[11] L. He, Q. Zhou, X. Li, L. Niu, G. Cheng, X. Li, W. Liu, Y. Tong, L. Ma, and L. Zhang, “End-to-end video object detection with spatial-temporal transformers,” In Proceedings of the 29th ACM International Conference on Multimedia, pp. 1507-1516, 2021.
[12] O. Kupyn, T. Martyniuk, J. Wu, and Z. Wang, “DeblurGAN-v2: Deblurring (orders-of-magnitude) faster and better,” In Proceedings of the IEEE international conference on computer vision, pp. 8878-8887, 2019.
[13] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, H. Zhiheng, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, and F.-F. Li, “Imagenet large scale visual recognition challenge,” International journal of computer vision, pp. 211-252, 2015.
[14] P. Goyal, P. Dollár, R. Girshick, P. Noordhuis, L. Wesolowski, A. Kyrola, A. Tulloch, Y.-Q. Jia, and K. He, “Accurate, large minibatch SGD: Training imagenet in 1 hour,” arXiv preprint arXiv:1706.02677, 2017.
[15] R. Girshick, J. Donahue, T. Darrell, and J. Malik, “Rich feature hierarchies for accurate object detection and semantic segmentation,” In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 580-587, 2014.
[16] R. Girshick, “Fast R-CNN,” In Proceedings of the IEEE international conference on computer vision, pp. 1440-1448, 2015.
[17] S. Zheng, Y. Wu, S. Jiang, C. Lu, and G. Gupta, “Deblur-yolo: Real-time object detection with efficient blind motion deblurring,” In Proceedings of the International Joint Conference on Neural Networks, pp. 1-8, 2021.
[18] T.-Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollár, “Focal loss for dense object detection,” In Proceedings of the IEEE international conference on computer vision, pp. 2980-2988, 2017.
[19] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, “SSD: Single shot multibox detector,” In Proceedings of the Computer Vision–ECCV 2016: 14th European Conference, pp. 21-37, 2016.
[20] X. Zhu, Y. Wang, J. Dai, L. Yuan, and Y. Wei, “Flow-guided feature aggregation for video object detection,” In Proceedings of the IEEE international conference on computer vision, pp. 408-417, 2017.
[21] Y. Chen, Y. Cao, H. Hu, and L. Wang, “Memory enhanced global-local aggregation for video object detection,” In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 10337-10346, 2020.
[22] Y. Shi, N. Wang, and X. Guo, “YOLOV: Making still image object detectors great at video object detection,” In Proceedings of the AAAI Conference on Artificial Intelligence, pp. 2254-2262, 2023.
[23] Y.-F. Zhang, W. Ren, Z. Zhang, Z. Jia, L. Wang, and T. Tan, “Focal and efficient IOU loss for accurate bounding box regression,” Neurocomputing, pp. 146-157, 2022.
[24] Z. Ge, S. Liu, F. Wang, Z. Li, and J. Sun, “YOLOX: Exceeding YOLO series in 2021,” arXiv preprint arXiv:2107.08430, 2021.
[25] 王家瑜, “基於模糊影像之物件偵測,” 碩士, 資訊工程研究所, 國立臺灣大學, 臺北市, 2022. | - |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/92738 | - |
dc.description.abstract | 自21世紀電腦運算速度呈爆發性成長後,深度學習方法開始在許多領域進行推行與應用,其中針對物件偵測所設計的類神經網路以YOLO系列為大宗。然而在真實場景應用中,常需要面對影像因為錄製者本身的晃動、鏡頭變焦、物體移動,甚至是場景內的霧氣所帶來影像模糊的問題。本研究使用YOLOX作為基礎模型,改善了傳統YOLO模型應用於單幀模糊影像的表現,並與現有的多幀偵測方法YOLOV進行結合,實現在單幀與多幀情境下,均能進行穩定預測的強健性類神經網路。本研究改進了現有靜態物件偵測模型的前處理方法,額外加入了全局灰階高斯模糊影像進行訓練,並優化損失函數以契合模糊影像的預測需求,實現在性能改善的同時,又不需額外時間來進行預測的新模糊影像偵測模型,並兼具新網路模型應用於各情境及新模型的泛用性。 | zh_TW |
dc.description.abstract | Since the explosive growth of computing power in the 21st century, deep learning has been applied across many fields, and the neural networks designed for object detection are dominated by the YOLO series. In real-world applications, however, images often suffer from blur caused by the recorder's own movement, camera zoom, object motion, or even fog within the scene. This study uses YOLOX as the base model, improves the performance of traditional YOLO models on single-frame blurry images, and integrates with the existing multi-frame detection method YOLOV to realize a robust neural network capable of stable prediction in both single-frame and multi-frame scenarios. The study enhances the preprocessing of existing static object detection models by additionally training on globally grayscale Gaussian-blurred images, and optimizes the loss function to fit the prediction needs of blurry images, yielding a new blurry-image detection model that improves performance without requiring extra inference time, while remaining broadly applicable across scenarios. | en |
dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2024-06-18T16:05:46Z No. of bitstreams: 0 | en |
dc.description.provenance | Made available in DSpace on 2024-06-18T16:05:47Z (GMT). No. of bitstreams: 0 | en |
dc.description.tableofcontents | 誌謝 i
摘要 ii
Abstract iii
目次 iv
圖次 vi
表次 vii
第一章 緒論 1
1.1 研究背景與動機 1
1.2 研究目標 3
1.3 論文架構 4
第二章 相關研究 5
2.1 單幀靜態物件偵測 5
2.2 視訊物件偵測 7
2.3 模糊物件偵測 9
2.4 小結 10
第三章 研究方法 11
3.1 問題定義 11
3.2 研究架構與設計 12
3.3 研究範圍與限制 23
第四章 實驗結果與討論 25
4.1 資料集篩選 25
4.2 單幀情境下識別靜態影像的清晰物件 29
4.3 單幀情境下識別動態影像的模糊物件 40
4.4 多幀情境下識別動態影像的模糊物件 48
4.5 小結 50
第五章 結論與未來展望 52
5.1 研究結論 52
5.2 未來展望 53
參考文獻 54
附錄 56 | - |
dc.language.iso | zh_TW | - |
dc.title | 應用於偵測視訊模糊物件的強健性類神經網路 | zh_TW |
dc.title | Robust Neural Network for Video Object Detection in Blurred Environments | en |
dc.type | Thesis | - |
dc.date.schoolyear | 112-2 | - |
dc.description.degree | 碩士 | - |
dc.contributor.oralexamcommittee | 張恆華;黃乾綱;謝傳璋 | zh_TW |
dc.contributor.oralexamcommittee | Heng-Hua Chang;Chien-Kang Huang;Chuan-Chang Hsieh | en |
dc.subject.keyword | 模糊物件偵測,視訊物件偵測,高斯模糊,類神經網路,影像處理 | zh_TW |
dc.subject.keyword | Blur Object Detection, Video Object Detection, Gaussian Blur, Neural Networks, Image Processing | en |
dc.relation.page | 56 | - |
dc.identifier.doi | 10.6342/NTU202401156 | - |
dc.rights.note | 同意授權(全球公開) | - |
dc.date.accepted | 2024-06-14 | - |
dc.contributor.author-college | 工學院 | - |
dc.contributor.author-dept | 工程科學及海洋工程學系 | - |
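The grayscale Gaussian-blur augmentation described in the abstract can be sketched as a minimal, self-contained illustration. This is not the thesis's actual implementation: the function names, the border-clamping choice, and the default `radius`/`sigma` values are all assumptions for illustration only.

```python
import math

def gaussian_kernel(radius, sigma):
    # 1-D Gaussian kernel, normalized so its weights sum to 1
    vals = [math.exp(-(i * i) / (2.0 * sigma * sigma))
            for i in range(-radius, radius + 1)]
    total = sum(vals)
    return [v / total for v in vals]

def to_grayscale(img):
    # img: H x W nested lists of (R, G, B); ITU-R BT.601 luma weights
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in img]

def blur_1d(row, kernel, radius):
    # Convolve one row with the kernel, clamping indices at the borders
    n = len(row)
    out = []
    for x in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            xx = min(max(x + k - radius, 0), n - 1)
            acc += w * row[xx]
        out.append(acc)
    return out

def gaussian_blur_gray(img, radius=2, sigma=1.0):
    # Global augmentation: grayscale conversion followed by a separable
    # Gaussian blur (horizontal pass, then vertical pass)
    kernel = gaussian_kernel(radius, sigma)
    gray = to_grayscale(img)
    horiz = [blur_1d(row, kernel, radius) for row in gray]
    cols = [blur_1d(list(col), kernel, radius) for col in zip(*horiz)]
    return [list(row) for row in zip(*cols)]
```

In a training pipeline of this kind, such a transform would typically be applied with some probability to each input image so the detector sees both sharp and blurred views of the same scene.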
Appears in Collections: | 工程科學及海洋工程學系
Files in This Item:
File | Size | Format |
---|---|---|
ntu-112-2.pdf | 2.45 MB | Adobe PDF |
All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.