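An Advanced Technique for Enhancing the Performance of Convolutional Neural Networks in Small Object Detection with High Resolution UAV Image (卷積網路應用於高畫質無人機影像中小物件偵測的優化辦法)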

林育銓; Yu-Chuan Lin

Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/95216

Title:	卷積網路應用於高畫質無人機影像中小物件偵測的優化辦法 An Advanced Technique for Enhancing the Performance of Convolutional Neural Networks in Small Object Detection with High Resolution UAV Image
Authors:	林育銓 Yu-Chuan Lin
Advisor:	韓仁毓 Jen-Yu Han
Keyword:	UAV imagery, deep learning, CNN, small object detection, regional proposal
Publication Year:	2024
Degree:	Master's
Abstract:	Drones are an emerging remote-sensing tool with broad application potential in civil engineering. Combining them with deep learning for automated object detection could extend that range further. Drone imagery, however, differs from conventional computer vision images: it is high resolution, and the objects of interest are small relative to the frame. When convolutional neural networks are applied to such images, the high resolution runs into the limits of GPU memory. Previous studies usually addressed this limit by reducing the image resolution, which blurs the features of the target objects and leaves few of them, making small object detection difficult. This study proposes an optimization approach based on the Focus-and-Detect concept: sub-regions of the image are proposed first, and detection is then run on those sub-regions. This reduces the size of the model's input images, and therefore the GPU memory requirement, without reducing resolution. Because the high resolution of the drone imagery is preserved, even relatively small objects retain sufficiently clear features, which makes the sub-region proposal method well suited to drone imagery.
For the sub-region proposal step, the study modifies the Selective Search algorithm to develop a sub-region proposal algorithm, Modified Selective Search. The candidate regions it proposes have the characteristics of ideal sub-regions: they are appropriately sized and contain multiple objects. A YOLOv8 model trained with this sub-region proposal method outperformed a YOLOv8 model trained on reduced-resolution images on the high-resolution images of the VisDrone drone imagery dataset, improving mAP by 12% on the test data and better distinguishing vehicles, trucks, and vans, which have similar features. The sub-region proposal method does require more training time, as well as storage space for the sub-region images; in short, it trades time for better performance and a lower GPU memory requirement.
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/95216
DOI:	10.6342/NTU202404011
Fulltext Rights:	Authorization granted (open to the public worldwide)
Appears in Collections:	Department of Civil Engineering
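
The abstract describes detecting on full-resolution sub-regions instead of on a downscaled image. The coordinate bookkeeping that this implies can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the sub-region is assumed to be given already (the Modified Selective Search step is not reproduced here), and the names `Box`, `clip_to_region`, `labels_for_region`, `to_image_coords` and the `min_visible` threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned box in pixel coordinates (x1, y1) to (x2, y2)."""
    x1: float
    y1: float
    x2: float
    y2: float

    def area(self):
        return max(0.0, self.x2 - self.x1) * max(0.0, self.y2 - self.y1)


def clip_to_region(box, region):
    """Intersect a full-image box with a sub-region and return it in
    sub-region-local coordinates, or None if they do not overlap."""
    x1, y1 = max(box.x1, region.x1), max(box.y1, region.y1)
    x2, y2 = min(box.x2, region.x2), min(box.y2, region.y2)
    if x2 <= x1 or y2 <= y1:
        return None
    return Box(x1 - region.x1, y1 - region.y1, x2 - region.x1, y2 - region.y1)


def labels_for_region(labels, region, min_visible=0.5):
    """Build training labels for one sub-region crop.

    A label is kept only if at least `min_visible` of its area lies inside
    the sub-region (an assumed rule, chosen here for illustration).
    """
    kept = []
    for box in labels:
        local = clip_to_region(box, region)
        if local is None or box.area() == 0:
            continue
        if local.area() / box.area() >= min_visible:
            kept.append(local)
    return kept


def to_image_coords(local_box, region):
    """Map a detection made on a sub-region crop back to the full image."""
    return Box(local_box.x1 + region.x1, local_box.y1 + region.y1,
               local_box.x2 + region.x1, local_box.y2 + region.y1)


# Hypothetical 640 x 640 sub-region inside a larger drone image.
region = Box(1000, 500, 1640, 1140)
labels = [Box(1100, 600, 1130, 640),   # fully inside the sub-region
          Box(1630, 600, 1670, 640)]   # only a quarter inside, so dropped
local_labels = labels_for_region(labels, region)
restored = [to_image_coords(b, region) for b in local_labels]
print(local_labels)
print(restored)
```

For a sense of scale, with hypothetical numbers: resizing a 3840-pixel-wide image to a 640-pixel-wide model input shrinks a 30-pixel-wide object to 30 × 640 / 3840 = 5 pixels, whereas a 640-pixel crop taken at native resolution keeps all 30 pixels.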

Files in This Item:

File	Size	Format
ntu-112-2.pdf Until 2025-08-09	32.84 MB	Adobe PDF
