Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/92738
Full metadata record
DC Field    Value    Language
dc.contributor.advisor    丁肇隆    zh_TW
dc.contributor.advisor    Chao-Lung Ting    en
dc.contributor.author    徐聖淮    zh_TW
dc.contributor.author    Sheng-Huai Hsu    en
dc.date.accessioned    2024-06-18T16:05:47Z    -
dc.date.available    2024-06-19    -
dc.date.copyright    2024-06-18    -
dc.date.issued    2024    -
dc.date.submitted    2024-06-13    -
dc.identifier.citation    [1] A. Bochkovskiy, C.-Y. Wang, and H.-Y. M. Liao, “YOLOv4: Optimal speed and accuracy of object detection,” arXiv preprint arXiv:2004.10934, 2020.
[2] A. Sabater, L. Montesano, and A.C. Murillo, “Robust and efficient post-processing for video object detection,” In Proceedings of the 2020 IEEE International Conference on Intelligent Robots and Systems, pp. 10536-10542, 2020.
[3] C. Li, L. Li, H. Jiang, K. Weng, Y. Geng, L. Li, Z. Ke, Q. Li, M. Cheng, W. Nie, Y. Li, B. Zhang, Y. Liang, L. Zhou, X. Xu, X. Chu, X. Wei, and X. Wei, “YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications,” arXiv preprint arXiv:2209.02976, 2022.
[4] C. Wang, A. Bochkovskiy, and H.-Y. M. Liao, “YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors,” In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 7464-7475, 2023.
[5] H. Zhang, M. Cisse, Y. N. Dauphin, and D. Lopez-Paz, “mixup: Beyond empirical risk minimization,” arXiv preprint arXiv:1710.09412, 2017.
[6] H. Zhang, Y. Wang, F. Dayoub, and N. Sunderhauf, “Varifocalnet: An iou-aware dense object detector,” In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 8514-8523, 2021.
[7] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, “You only look once: Unified, real-time object detection,” In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 779-788, 2016.
[8] J. Redmon and A. Farhadi, “YOLO9000: better, faster, stronger,” In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 7263-7271, 2017.
[9] J. Redmon and A. Farhadi, “YOLOv3: An incremental improvement,” arXiv preprint arXiv:1804.02767, 2018.
[10] K. He, G. Gkioxari, P. Dollár, and R. Girshick, “Mask R-CNN,” In Proceedings of the IEEE international conference on computer vision, pp. 2961-2969, 2017.
[11] L. He, Q. Zhou, X. Li, L. Niu, G. Cheng, X. Li, W. Liu, Y. Tong, L. Ma, and L. Zhang, “End-to-end video object detection with spatial-temporal transformers,” In Proceedings of the 29th ACM International Conference on Multimedia, pp. 1507-1516, 2021.
[12] O. Kupyn, T. Martyniuk, J. Wu, and Z. Wang, “DeblurGAN-v2: Deblurring (orders-of-magnitude) faster and better,” In Proceedings of the IEEE international conference on computer vision, pp. 8878-8887, 2019.
[13] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, H. Zhiheng, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, and F.-F. Li, “Imagenet large scale visual recognition challenge,” International journal of computer vision, pp. 211-252, 2015.
[14] P. Goyal, P. Dollár, R. Girshick, P. Noordhuis, L. Wesolowski, A. Kyrola, A. Tulloch, Y.-Q. Jia, and K. He, “Accurate, large minibatch SGD: Training imagenet in 1 hour,” arXiv preprint arXiv:1706.02677, 2017.
[15] R. Girshick, J. Donahue, T. Darrell, and J. Malik, “Rich feature hierarchies for accurate object detection and semantic segmentation,” In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 580-587, 2014.
[16] R. Girshick, “Fast R-CNN,” In Proceedings of the IEEE international conference on computer vision, pp. 1440-1448, 2015.
[17] S. Zheng, Y. Wu, S. Jiang, C. Lu, and G. Gupta, “Deblur-yolo: Real-time object detection with efficient blind motion deblurring,” In Proceedings of the International Joint Conference on Neural Networks, pp. 1-8, 2021.
[18] T.-Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollár, “Focal loss for dense object detection,” In Proceedings of the IEEE international conference on computer vision, pp. 2980-2988, 2017.
[19] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, “SSD: Single shot multibox detector,” In Proceedings of the Computer Vision–ECCV 2016: 14th European Conference, pp. 21-37, 2016.
[20] X. Zhu, Y. Wang, J. Dai, L. Yuan, and Y. Wei, “Flow-guided feature aggregation for video object detection,” In Proceedings of the IEEE international conference on computer vision, pp. 408-417, 2017.
[21] Y. Chen, Y. Cao, H. Hu, and L. Wang, “Memory enhanced global-local aggregation for video object detection,” In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 10337-10346, 2020.
[22] Y. Shi, N. Wang, and X. Guo, “YOLOV: Making still image object detectors great at video object detection,” In Proceedings of the AAAI Conference on Artificial Intelligence, pp. 2254-2262, 2023.
[23] Y.-F. Zhang, W. Ren, Z. Zhang, Z. Jia, L. Wang, and T. Tan, “Focal and efficient IOU loss for accurate bounding box regression,” Neurocomputing, pp. 146-157, 2022.
[24] Z. Ge, S. Liu, F. Wang, Z. Li, and J. Sun, “YOLOX: Exceeding YOLO series in 2021,” arXiv preprint arXiv:2107.08430, 2021.
[25] 王家瑜, “Object detection based on blurred images,” Master's thesis, Graduate Institute of Computer Science and Information Engineering, National Taiwan University, Taipei, 2022.
-
dc.identifier.uri    http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/92738    -
dc.description.abstract    自21世紀電腦運算速度呈爆發性成長後,深度學習方法開始在許多領域進行推行與應用,其中針對物件偵測所設計的類神經網路以YOLO系列為大宗。然而在真實場景應用中,常需要面對影像因為錄製者本身的晃動、鏡頭變焦、物體移動,甚至是場景內的霧氣所帶來影像模糊的問題。本研究使用YOLOX作為基礎模型,改善了傳統YOLO模型應用於單幀模糊影像的表現,並與現有的多幀偵測方法YOLOV進行結合,實現在單幀與多幀情境下,均能進行穩定預測的強健性類神經網路。本研究改進了現有靜態物件偵測模型的前處理方法,額外加入了全局灰階高斯模糊影像進行訓練,並優化損失函數以契合模糊影像的預測需求,實現在性能改善的同時,又不需額外時間來進行預測的新模糊影像偵測模型,並兼具新網路模型應用於各情境及新模型的泛用性。    zh_TW
dc.description.abstract    Since the explosive growth of computing power in the 21st century, deep learning methods have been adopted across many fields; among neural networks designed for object detection, the YOLO series is the most widely used. In real-world applications, however, images often suffer from blur caused by camera shake, lens zoom, object motion, or even fog within the scene. This study uses YOLOX as the base model, improves the performance of conventional YOLO models on single-frame blurry images, and integrates it with the existing multi-frame detection method YOLOV, yielding a robust neural network that predicts stably in both single-frame and multi-frame scenarios. The study enhances the preprocessing of existing static object detection models by additionally training on globally grayscale Gaussian-blurred images, and optimizes the loss function to suit the prediction of blurry images, producing a new blur-aware detection model that improves performance without requiring extra inference time, while remaining broadly applicable across scenarios.    en
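The preprocessing step the abstract describes, training on globally grayscale Gaussian-blurred copies of the input images, can be sketched as follows. This is a minimal illustration, not the thesis's actual pipeline: the function names, the sigma value, and the pure-NumPy separable convolution are assumptions made for the example.

```python
import numpy as np

def gaussian_kernel1d(sigma, radius=None):
    """Normalized 1-D Gaussian kernel; radius defaults to 3*sigma."""
    if radius is None:
        radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()

def grayscale_gaussian_blur(img, sigma=2.0):
    """Convert an RGB image (H, W, 3, uint8) to grayscale, then apply a
    global Gaussian blur as two separable 1-D convolutions."""
    # ITU-R BT.601 luma weights for the grayscale conversion
    gray = img.astype(np.float32) @ np.array([0.299, 0.587, 0.114], np.float32)
    k = gaussian_kernel1d(sigma)
    pad = len(k) // 2
    padded = np.pad(gray, pad, mode="reflect")
    # horizontal pass, then vertical pass; "valid" undoes the padding
    blurred = np.apply_along_axis(
        lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    blurred = np.apply_along_axis(
        lambda c: np.convolve(c, k, mode="valid"), 0, blurred)
    return blurred.clip(0, 255).astype(np.uint8)
```

In the setting the abstract describes, such blurred copies would be mixed into the YOLOX training set alongside the original frames, so the detector sees both sharp and blur-degraded views of each scene.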
dc.description.provenance    Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2024-06-18T16:05:46Z. No. of bitstreams: 0    en
dc.description.provenance    Made available in DSpace on 2024-06-18T16:05:47Z (GMT). No. of bitstreams: 0    en
dc.description.tableofcontents    Acknowledgements i
Abstract (Chinese) ii
Abstract iii
Table of Contents iv
List of Figures vi
List of Tables vii
Chapter 1 Introduction 1
1.1 Background and Motivation 1
1.2 Research Objectives 3
1.3 Thesis Organization 4
Chapter 2 Related Work 5
2.1 Single-Frame Static Object Detection 5
2.2 Video Object Detection 7
2.3 Blurred Object Detection 9
2.4 Summary 10
Chapter 3 Methodology 11
3.1 Problem Definition 11
3.2 Research Framework and Design 12
3.3 Scope and Limitations 23
Chapter 4 Experimental Results and Discussion 25
4.1 Dataset Selection 25
4.2 Detecting Clear Objects in Static Images in the Single-Frame Setting 29
4.3 Detecting Blurred Objects in Dynamic Images in the Single-Frame Setting 40
4.4 Detecting Blurred Objects in Dynamic Images in the Multi-Frame Setting 48
4.5 Summary 50
Chapter 5 Conclusions and Future Work 52
5.1 Conclusions 52
5.2 Future Work 53
References 54
Appendix 56
-
dc.language.iso    zh_TW    -
dc.subject    視訊物件偵測    zh_TW
dc.subject    模糊物件偵測    zh_TW
dc.subject    影像處理    zh_TW
dc.subject    類神經網路    zh_TW
dc.subject    高斯模糊    zh_TW
dc.subject    Gaussian Blur    en
dc.subject    Neural Networks    en
dc.subject    Image Processing    en
dc.subject    Video Object Detection    en
dc.subject    Blur Object Detection    en
dc.title    應用於偵測視訊模糊物件的強健性類神經網路    zh_TW
dc.title    Robust Neural Network for Video Object Detection in Blurred Environments    en
dc.type    Thesis    -
dc.date.schoolyear    112-2    -
dc.description.degree    Master's    -
dc.contributor.oralexamcommittee    張恆華;黃乾綱;謝傳璋    zh_TW
dc.contributor.oralexamcommittee    Heng-Hua Chang;Chien-Kang Huang;Chuan-Chang Hsieh    en
dc.subject.keyword    模糊物件偵測, 視訊物件偵測, 高斯模糊, 類神經網路, 影像處理    zh_TW
dc.subject.keyword    Blur Object Detection, Video Object Detection, Gaussian Blur, Neural Networks, Image Processing    en
dc.relation.page    56    -
dc.identifier.doi    10.6342/NTU202401156    -
dc.rights.note    Authorized (open access worldwide)    -
dc.date.accepted    2024-06-14    -
dc.contributor.author-college    College of Engineering    -
dc.contributor.author-dept    Department of Engineering Science and Ocean Engineering    -
Appears in Collections: Department of Engineering Science and Ocean Engineering

Files in This Item:
File    Size    Format
ntu-112-2.pdf    2.45 MB    Adobe PDF


Items in this system are protected by copyright, with all rights reserved, unless otherwise indicated in their licensing terms.
