Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/102286

| Title: | 無人機空拍影像之小物件偵測優化 Optimization of Small Object Detection in UAV Aerial Imagery |
| Author: | 劉品佑 Pin-Yu Liu |
| Advisor: | 丁肇隆 Chao-Lung Ting |
| Keywords: | UAV, Object Detection, YOLOv11, Small Object Detection, BiFPN, SPD-Conv, SAHI |
| Publication Year: | 2026 |
| Degree: | Master's |
| Abstract: | UAV aerial imagery is characterized by wide fields of view, complex backgrounds, and minute target scales, posing significant challenges for traditional object detection models in terms of feature loss and missed detections. This study proposes an improved detection framework based on YOLOv11 to enhance small object detection in drone-view scenarios. The architecture is first optimized by replacing the original feature fusion layers with a Bidirectional Feature Pyramid Network (BiFPN), which introduces learnable weights to strengthen cross-scale feature integration. Simultaneously, Space-to-Depth Convolution (SPD-Conv) is introduced into the backbone's shallow layers, replacing conventional strided convolution with lossless downsampling to preserve the fine-grained information of tiny targets. To address high-resolution imagery, the Slicing Aided Hyper Inference (SAHI) strategy is integrated at inference time, employing sliding-window slicing and Non-Maximum Merging (NMM) to further maximize small-object recall. Experimental results on the challenging VisDrone-DET2019 benchmark show that the proposed model achieves an mAP@50 of 50.40% in standard full-frame inference mode, outperforming the baseline YOLOv11s (40.31%); with SAHI, the mAP@50 further increases to 61.82%. Compared with existing state-of-the-art models, this method achieves detection accuracy comparable to heavy models (e.g., DRONE-YOLO, 76.2M parameters) while using only 12.02M parameters, demonstrating an excellent balance between accuracy and computational efficiency and strong potential for practical deployment. |
| URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/102286 |
| DOI: | 10.6342/NTU202600891 |
| Full-Text Authorization: | Authorized (open access within campus only) |
| Electronic Full-Text Release Date: | 2026-05-01 |
| Appears in Collections: | Department of Engineering Science and Ocean Engineering |
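The abstract's SPD-Conv module rests on a space-to-depth rearrangement: instead of discarding pixels as a stride-2 convolution does, each 2×2 spatial block is moved into the channel dimension, so the feature map shrinks spatially without losing any values. A minimal NumPy sketch of that rearrangement alone (the `space_to_depth` helper is illustrative, not the thesis's implementation):

```python
import numpy as np

def space_to_depth(x, scale=2):
    """Lossless downsampling: move each scale x scale spatial block
    into the channel axis. x has shape (C, H, W); H, W divisible by scale."""
    c, h, w = x.shape
    assert h % scale == 0 and w % scale == 0
    x = x.reshape(c, h // scale, scale, w // scale, scale)
    x = x.transpose(2, 4, 0, 1, 3)        # (scale, scale, C, H/s, W/s)
    return x.reshape(c * scale * scale, h // scale, w // scale)

x = np.arange(2 * 4 * 4, dtype=np.float32).reshape(2, 4, 4)
y = space_to_depth(x)
print(y.shape)  # (8, 2, 2): half the resolution, 4x the channels
```

In the full SPD-Conv block this rearrangement is followed by a non-strided convolution that mixes the expanded channels; only the lossless reshaping step is shown here.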
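The BiFPN fusion mentioned in the abstract assigns each input feature map a learnable non-negative scalar weight, normalized so the weights sum to roughly one ("fast normalized fusion"). A sketch assuming NumPy and same-shape inputs; in the real network the weights are trained and the inputs are first resized to a common scale:

```python
import numpy as np

def fast_normalized_fusion(features, weights, eps=1e-4):
    """BiFPN-style fusion: ReLU-clipped weights, normalized to sum to ~1."""
    w = np.maximum(weights, 0.0)   # keep contributions non-negative
    w = w / (w.sum() + eps)        # cheap normalization (no softmax)
    return sum(wi * f for wi, f in zip(w, features))

f1 = np.ones((64, 8, 8))            # e.g. top-down pathway feature
f2 = np.full((64, 8, 8), 3.0)       # e.g. lateral input feature
fused = fast_normalized_fusion([f1, f2], np.array([1.0, 1.0]))
```

With equal weights the result is (almost exactly) the mean of the inputs; during training the network learns to emphasize whichever scale carries more signal for small objects.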
Files in This Item:
| File | Size | Format |
|---|---|---|
| ntu-114-2.pdf (restricted to NTU campus IPs; off-campus users please use the VPN service) | 4.37 MB | Adobe PDF |
All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.
