Deep Learning Models and Dataset Split Strategy for Ship Classification in Thermal Images

陳品涵; Pin-Han Chen

Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/101058

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	李佳翰	zh_TW
dc.contributor.advisor	Jia-Han Li	en
dc.contributor.author	陳品涵	zh_TW
dc.contributor.author	Pin-Han Chen	en
dc.date.accessioned	2025-11-27T16:05:43Z	-
dc.date.available	2025-11-28	-
dc.date.copyright	2025-11-27	-
dc.date.issued	2025	-
dc.date.submitted	2025-11-17	-
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/101058	-
dc.description.abstract	本研究針對船舶熱成像影像分類任務，探討不同深度學習模型與資料分割策略對分類效能之影響。由於熱成像影像能在夜間或低能見度環境下提供穩定的觀測能力，對於海上監測與智慧航行具有重要應用價值。然而，現有的相關研究與資料集仍相對不足。為此，本研究建置一套涵蓋貨船、渡輪、遊艇及其他船舶四個類別之熱成像資料集，透過資料處理、分類標註與分層抽樣，共獲得4,834張有效影像，並採用兩種資料集分割策略進行模型訓練與測試。實驗選用基於卷積神經網路架構的YOLOv8-cls模型，以及基於Transformer架構的ViT模型進行比較。結果顯示，在本研究所使用之資料規模與特性下，YOLOv8-cls在收斂速度與分類穩定性方面呈現較佳表現；而ViT模型對資料量與多樣性較為敏感，當資料有限時其優勢較不明顯，需在資料量或影像變化更充足的情況下，才有機會充分展現其全域特徵建模能力。此結果顯示不同模型在不同資料情境下的適用性差異。本研究提供完整的資料集建置流程與模型效能比較，建立可重複、可擴充的實證資源，為後續船舶熱成像影像分類及相關應用研究提供參考。未來可進一步朝向多模態資料融合、混合式架構設計與智慧航行應用等方向發展。	zh_TW
dc.description.abstract	This study investigates how different deep learning model architectures and dataset partitioning strategies affect ship thermal image classification performance. Thermal imaging provides stable observation under nighttime or low-visibility conditions, making it valuable for maritime surveillance and intelligent navigation. However, existing research and publicly available datasets remain limited, restricting the development of deep learning applications in this domain. To address this gap, this study constructs a dedicated thermal image dataset comprising four vessel categories: cargo ships, ferries, yachts, and other ships. Through data cleaning, classification annotation, and stratified sampling, a total of 4,834 valid images were obtained and used for model training and evaluation under two dataset partitioning strategies. Two representative architectures were selected for comparison: the convolutional neural network-based YOLOv8-cls model and the self-attention-based Vision Transformer (ViT) model. Experimental results show that, under the data scale and characteristics used in this study, YOLOv8-cls converges faster and classifies more stably. In contrast, the ViT model is more sensitive to dataset size and diversity; when data are limited its advantages are less pronounced, and its global feature modeling capability may only emerge fully when larger or more varied datasets are available. These findings suggest that the suitability of each model depends on data conditions, rather than indicating absolute performance superiority. This research documents the complete dataset construction process and model performance comparison, establishing a reproducible and extensible empirical resource that can serve as a reference for subsequent research in ship thermal image classification and related applications. Future work may further explore multimodal data fusion, hybrid model architectures, and applications in intelligent maritime navigation.	en
dc.description.provenance	Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-11-27T16:05:43Z No. of bitstreams: 0	en
dc.description.provenance	Made available in DSpace on 2025-11-27T16:05:43Z (GMT). No. of bitstreams: 0	en
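dc.description.tableofcontents	Acknowledgements (p. i); Chinese Abstract (p. ii); ABSTRACT (p. iii); Table of Contents (p. v); List of Figures (p. vii); List of Tables (p. ix); Chapter 1 Introduction (p. 1); 1.1 Research Background (p. 1); 1.2 Research Motivation (p. 2); 1.3 Thesis Organization (p. 4); Chapter 2 Literature Review (p. 5); 2.1 Ship Thermal Imaging Technology (p. 5); 2.2 Ship Image Datasets (p. 6); 2.3 Deep Learning Techniques (p. 7); 2.3.1 Convolutional Neural Networks (p. 7); 2.3.2 YOLOv8-cls (p. 9); 2.3.3 Vision Transformer (p. 11); 2.3.4 Model Evaluation Metrics (p. 13); Chapter 3 Methodology (p. 17); 3.1 Dataset Preparation (p. 18); 3.1.1 Manual Data Cleaning (p. 18); 3.1.2 Image Classification Labeling and Preprocessing (p. 20); 3.2 Model Construction and Training (p. 24); 3.2.1 Development Environment Setup (p. 24); 3.2.2 Training Model Construction and Hyperparameter Settings (p. 25); 3.3 Dataset Split Strategies (p. 27); Chapter 4 Results and Analysis (p. 31); 4.1 YOLOv8-cls Model Analysis (p. 31); 4.1.1 Split Strategy A (7:1:2) (p. 31); 4.1.2 Split Strategy B (8:1:1) (p. 33); 4.1.3 Choice of Data Split Strategy for the YOLOv8-cls Model (p. 36); 4.2 ViT Model Analysis (p. 36); 4.2.1 Split Strategy A (7:1:2) (p. 36); 4.2.2 Split Strategy B (8:1:1) (p. 39); 4.2.3 Choice of Data Split Strategy for the ViT Model (p. 42); 4.3 Overall Model Comparison and Performance Analysis (p. 42); Chapter 5 Conclusions and Future Work (p. 44); 5.1 Conclusions (p. 44); 5.2 Future Work (p. 45); References (p. 47)	-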
dc.language.iso	zh_TW	-
dc.subject	船舶分類	-
dc.subject	熱成像	-
dc.subject	深度學習	-
dc.subject	資料分割	-
dc.subject	Ship Classification	-
dc.subject	Thermal Imagery	-
dc.subject	Deep Learning	-
dc.subject	Data Partition	-
dc.title	船舶熱成像分類之深度學習模型與資料集分割策略	zh_TW
dc.title	Deep Learning Models and Dataset Split Strategy for Ship Classification in Thermal Images	en
dc.type	Thesis	-
dc.date.schoolyear	114-1	-
dc.description.degree	Master's	-
dc.contributor.oralexamcommittee	張瑞益;陳建璋;李子宜;鄭信鴻	zh_TW
dc.contributor.oralexamcommittee	Ray-I Chang;Chien-Chang Chen;Zi-Yi Lee;Hsin-Hung Cheng	en
dc.subject.keyword	船舶分類,熱成像,深度學習,資料分割	zh_TW
dc.subject.keyword	Ship Classification,Thermal Imagery,Deep Learning,Data Partition	en
dc.relation.page	51	-
dc.identifier.doi	10.6342/NTU202504682	-
dc.rights.note	Authorization granted (open access worldwide)	-
dc.date.accepted	2025-11-18	-
dc.contributor.author-college	College of Engineering	-
dc.contributor.author-dept	Department of Engineering Science and Ocean Engineering	-
dc.date.embargo-lift	2028-11-07	-
Appears in Collections:	Department of Engineering Science and Ocean Engineering
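
The abstract mentions stratified sampling, and the table of contents names two split strategies, A (7:1:2) and B (8:1:1). As a rough illustration only, the sketch below shows one common way such a class-preserving split can be done. It is not the thesis's actual procedure: the function name `stratified_split`, the reading of the ratios as train:validation:test, the class label strings, and the toy class counts are all assumptions, since this record does not give per-class image counts or implementation details.

```python
import random
from collections import defaultdict


def stratified_split(labels, ratios, seed=0):
    """Split sample indices into train/val/test while keeping class proportions.

    labels: one class name per image (hypothetical input format).
    ratios: (train, val, test) weights, e.g. (7, 1, 2) or (8, 1, 1);
            the train:val:test ordering is an assumption.
    """
    rng = random.Random(seed)

    # Group sample indices by class so each class is split separately.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    total = sum(ratios)
    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n = len(indices)
        n_train = n * ratios[0] // total
        n_val = n * ratios[1] // total
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # any rounding remainder goes to test
    return train, val, test


# Toy labels only; the real per-class counts are not stated in this record.
labels = ["cargo"] * 40 + ["ferry"] * 30 + ["yacht"] * 20 + ["other"] * 10
split_a = stratified_split(labels, (7, 1, 2))  # strategy A
split_b = stratified_split(labels, (8, 1, 1))  # strategy B
print([len(part) for part in split_a])
print([len(part) for part in split_b])
```

Splitting each class separately keeps rare classes represented in every subset, which matters when one category (here the toy "other" class) is much smaller than the rest. A fixed seed makes the split repeatable across the two model families being compared.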

Files in This Item:

File	Size	Format
ntu-114-1.pdf (publicly available online after 2028-11-07)	2.05 MB	Adobe PDF


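Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.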