Improving Imbalanced Multi-Label Image Classification Performance via Label Grouping and Dedicated Models

賴冠瑜; Guan-Yu Lai

Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/101014

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	周承復	zh_TW
dc.contributor.advisor	Cheng-Fu Chou	en
dc.contributor.author	賴冠瑜	zh_TW
dc.contributor.author	Guan-Yu Lai	en
dc.date.accessioned	2025-11-26T16:28:32Z	-
dc.date.available	2025-11-27	-
dc.date.copyright	2025-11-26	-
dc.date.issued	2025	-
dc.date.submitted	2025-09-19	-
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/101014	-
dc.description.abstract	本研究針對多標籤影像分類中的標籤不平衡問題提出解法。在此問題中，不同標籤的出現頻率差異極大，導致模型偏向多數標籤，忽略重要的稀有標籤，例如在影像標註與醫療診斷等應用中。現有方法多採用損失重加權、資料重取樣或視覺-語言模型來改善稀有標籤表現，但可能需要大量超參數調整，或依賴額外訓練資料，且未必適用於特定領域。為了提高不平衡多標籤影像分類的表現，本研究提出有別於以往的解決方法：我們將標籤依頻率分組並為每組訓練專屬模型，以提升模型訓練的穩定性與專注度，同時保留標籤關聯資訊。我們設計了兩個演算法：一是動態規劃分組演算法，確保分組內部平衡並最小化分組數；二是輔助標籤擴展演算法，以平衡指標引入輔助標籤一起訓練。本方法於 COCO-MLT、VOC-MLT 及 MuReD 三個資料集上驗證，並在不同分佈標籤表現皆顯著提升。	zh_TW
dc.description.abstract	This study addresses the problem of label imbalance in multi-label image classification. In such tasks, label frequencies vary widely, causing models to favor frequent labels while neglecting important rare ones, a common challenge in applications such as image tagging and medical diagnosis. Existing methods often rely on loss reweighting, data resampling, or vision-language models to improve performance on rare labels. However, these approaches may require extensive hyperparameter tuning or additional training data, and they may not generalize well to specific domains. To enhance performance in imbalanced multi-label image classification, we propose a different approach that groups labels by frequency and trains a dedicated model for each group. This strategy improves training stability and lets each model focus on its own group while preserving inter-label dependencies. We introduce two algorithms: (1) a dynamic programming-based label grouping algorithm that ensures intra-group balance and minimizes the number of groups, and (2) an auxiliary label expansion algorithm that incorporates additional labels during training based on a balancing metric. Our method is evaluated on three datasets: COCO-MLT, VOC-MLT, and MuReD. Experiments show that our method significantly improves performance across head, mid-frequency, and tail labels.	en
dc.description.provenance	Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-11-26T16:28:32Z No. of bitstreams: 0	en
dc.description.provenance	Made available in DSpace on 2025-11-26T16:28:32Z (GMT). No. of bitstreams: 0	en
dc.description.tableofcontents	Verification Letter from the Oral Examination Committee; Acknowledgements; 摘要; Abstract; Contents; List of Figures; List of Tables; Chapter 1 Introduction; Chapter 2 Related Work; 2.1 Resample methods; 2.1.1 LP-ROS and LP-RUS; 2.1.2 ML-ROS and ML-RUS; 2.1.3 MLeNN: Multi-Label Edited Nearest Neighbors; 2.1.4 REMEDIAL and REMEDIAL-HwR; 2.1.5 MLTL; 2.1.6 Classaware Sampling; 2.2 Cost Sensitive methods; 2.2.1 Class-level Re-weighting; 2.2.2 Sample-level Re-weighting; 2.2.3 Hybrid method; 2.3 Ensemble Methods; 2.4 Model Adaption; 2.5 Label Imbalance and Correlation Metrics; 2.5.1 IRLBL (Imbalance Ratio per Label); 2.5.2 meanIR (Mean Imbalance Ratio); 2.5.3 CVIR (Coefficient of Variation of IR); 2.5.4 SCUMBLE (Score of Concurrence among Imbalanced Labels); 2.6 Imbalanced Datasets; 2.6.1 COCO-MLT; 2.6.2 VOC-MLT; 2.6.3 MuReD; Chapter 3 Method; 3.1 Label Grouping Problem Definition; 3.2 Dynamic Programming Formulation; 3.3 Label Grouping Algorithm Implementation Detail; 3.3.1 Step 1: Valid Group Identification; 3.3.2 Step 2: Dynamic Programming for Optimal Partition; 3.3.3 Step 3: Reconstructing Label Groups; 3.3.4 Computational Complexity; 3.4 Auxiliary Label Expansion for Enhanced Group Training; 3.4.1 Algorithm Detail: Progressive Label Group Expansion Minimizing Imbalance; Chapter 4 Experiments; 4.1 Evaluation; 4.2 Comparison Methods; 4.3 Implementation Detail; 4.4 Result Discussion; 4.4.1 Trade-off Between Generalization and Frequency Imbalance; 4.4.2 Analysis of Benefit of Auxiliary Labels Across Models; 4.4.3 Mitigating Noise Caused by Label Imbalance on Multi-Label Training; Chapter 5 Conclusion; References; Appendix A: Proof of Optimality of the Dynamic Programming Algorithm; Appendix B: Partition Result Details	-
dc.language.iso	en	-
dc.subject	深度學習	-
dc.subject	多標籤模型	-
dc.subject	多標籤分類	-
dc.subject	醫學眼底影像	-
dc.subject	不平衡多標籤數據集	-
dc.subject	疾病分類	-
dc.subject	圖片標記	-
dc.subject	Deep Learning	-
dc.subject	Multi-Label Model	-
dc.subject	Multi-Label Classification	-
dc.subject	Medical Fundus Images	-
dc.subject	Imbalanced Multi-label Datasets	-
dc.subject	Disease Classification	-
dc.subject	Image Tagging	-
dc.title	透過標籤分組與專屬模型提升不平衡多標籤影像分類表現	zh_TW
dc.title	Improving Imbalanced Multi-Label Image Classification Performance via Label Grouping and Dedicated Models	en
dc.type	Thesis	-
dc.date.schoolyear	114-1	-
dc.description.degree	Master's	-
dc.contributor.oralexamcommittee	吳曉光;蔡瑞煌;李明穗;陳駿丞	zh_TW
dc.contributor.oralexamcommittee	Hsiao-Kuang Wu;Rua-Huan Tsaih;Ming-Sui Lee;Jun-Cheng Chen	en
dc.subject.keyword	深度學習,多標籤模型,多標籤分類,醫學眼底影像,不平衡多標籤數據集,疾病分類,圖片標記	zh_TW
dc.subject.keyword	Deep Learning,Multi-Label Model,Multi-Label Classification,Medical Fundus Images,Imbalanced Multi-label Datasets,Disease Classification,Image Tagging	en
dc.relation.page	60	-
dc.identifier.doi	10.6342/NTU202504480	-
dc.rights.note	Not authorized	-
dc.date.accepted	2025-09-19	-
dc.contributor.author-college	College of Electrical Engineering and Computer Science	-
dc.contributor.author-dept	Graduate Institute of Networking and Multimedia	-
dc.date.embargo-lift	N/A	-
Appears in Collections:	Graduate Institute of Networking and Multimedia
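
The abstract describes a dynamic programming-based grouping step that keeps each label group internally balanced while using as few groups as possible, and the table of contents lists it as three steps (valid group identification, dynamic programming for the optimal partition, and group reconstruction). The following Python sketch illustrates that general idea only. It is not the thesis's implementation: the function name group_labels, the max_ratio parameter, the validity rule (largest label count at most max_ratio times the smallest count within a group), the restriction to contiguous groups in frequency order, and the example label counts are all assumptions made for illustration, since this record does not state the actual balance criterion.

```python
from typing import Dict, List


def group_labels(counts: Dict[str, int], max_ratio: float) -> List[List[str]]:
    """Split labels, sorted by descending frequency, into the fewest
    contiguous groups whose max/min count ratio is at most max_ratio.

    Assumes every count is positive and max_ratio >= 1, so that a
    single-label group is always valid (hypothetical validity rule).
    """
    labels = sorted(counts, key=counts.get, reverse=True)
    n = len(labels)
    inf = float("inf")
    best = [0] + [inf] * n  # best[i]: fewest groups covering labels[:i]
    cut = [0] * (n + 1)     # cut[i]: start index of the last group in that solution

    for i in range(1, n + 1):
        for j in range(i):
            # Candidate group labels[j:i]; labels[j] is its most frequent
            # label and labels[i - 1] its least frequent one.
            balanced = counts[labels[j]] <= max_ratio * counts[labels[i - 1]]
            if balanced and best[j] + 1 < best[i]:
                best[i] = best[j] + 1
                cut[i] = j

    # Walk the stored cut points backwards to recover the groups.
    groups: List[List[str]] = []
    i = n
    while i > 0:
        j = cut[i]
        groups.append(labels[j:i])
        i = j
    groups.reverse()
    return groups


if __name__ == "__main__":
    # Made-up label counts, not taken from COCO-MLT, VOC-MLT, or MuReD.
    example = {"person": 1000, "car": 600, "dog": 90, "cat": 80, "kite": 5}
    print(group_labels(example, max_ratio=2.0))
```

In this sketch each resulting group would be handed to its own model; the auxiliary label expansion mentioned in the abstract is not modeled here.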

Files in This Item:

File	Size	Format
ntu-114-1.pdf (not authorized for public access)	4.89 MB	Adobe PDF


Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are otherwise indicated.