Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/101014
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | 周承復 | zh_TW
dc.contributor.advisor | Cheng-Fu Chou | en
dc.contributor.author | 賴冠瑜 | zh_TW
dc.contributor.author | Guan-Yu Lai | en
dc.date.accessioned | 2025-11-26T16:28:32Z | -
dc.date.available | 2025-11-27 | -
dc.date.copyright | 2025-11-26 | -
dc.date.issued | 2025 | -
dc.date.submitted | 2025-09-19 | -
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/101014 | -
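dc.description.abstract | This study proposes a solution to the label imbalance problem in multi-label image classification. In this problem, label frequencies differ greatly, which biases models toward majority labels and causes them to overlook important rare labels, for example in applications such as image tagging and medical diagnosis. Existing methods mostly use loss reweighting, data resampling, or vision-language models to improve performance on rare labels, but they may require extensive hyperparameter tuning or depend on additional training data, and they are not necessarily suitable for specific domains.
To improve performance in imbalanced multi-label image classification, this study proposes an approach that differs from prior work: we group labels by frequency and train a dedicated model for each group, which improves training stability and focus while preserving label correlation information. We design two algorithms: a dynamic programming grouping algorithm that ensures balance within each group and minimizes the number of groups, and an auxiliary label expansion algorithm that uses a balance metric to introduce auxiliary labels for joint training. The method is validated on three datasets, COCO-MLT, VOC-MLT, and MuReD, and significantly improves performance on labels across different parts of the frequency distribution.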
zh_TW
dc.description.abstract | This study addresses label imbalance in multi-label image classification. In such tasks, label frequencies vary widely, causing models to favor frequent labels while neglecting important rare ones, a common challenge in applications such as image tagging and medical diagnosis. Existing methods often rely on loss reweighting, data resampling, or vision-language models to improve performance on rare labels. However, these approaches typically require extensive hyperparameter tuning or additional training data, and they may not generalize well to specific domains.
To improve performance in imbalanced multi-label image classification, we propose an approach that groups labels by frequency and trains a dedicated model for each group. This strategy improves training stability and lets each model focus on its own group while preserving inter-label dependencies. We introduce two algorithms: (1) a dynamic programming-based label grouping algorithm that ensures intra-group balance and minimizes the number of groups, and (2) an auxiliary label expansion algorithm that incorporates additional labels during training based on a balancing metric.
Our method is evaluated on three datasets: COCO-MLT, VOC-MLT, and MuReD. Experiments show that it significantly improves performance across head, mid-frequency, and tail labels.
en
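The label grouping step described in the abstract can be illustrated with a minimal sketch. It assumes that labels are sorted by frequency, that a group is a contiguous run in that order, and that a group counts as balanced when its most frequent label occurs at most `max_ratio` times as often as its least frequent one. The function name `group_labels`, the parameter `max_ratio`, and this balance criterion are hypothetical; the thesis may define group validity differently.

```python
from typing import Dict, List


def group_labels(counts: Dict[str, int], max_ratio: float) -> List[List[str]]:
    """Split labels into the fewest contiguous, balanced groups (assumed criterion)."""
    if max_ratio < 1:
        raise ValueError("max_ratio must be at least 1")

    # Sort labels from most to least frequent.
    labels = sorted(counts, key=counts.get, reverse=True)
    freq = [counts[label] for label in labels]
    n = len(labels)

    def valid(j: int, i: int) -> bool:
        # labels[j:i] is balanced when max count <= max_ratio * min count.
        return freq[j] <= max_ratio * freq[i - 1]

    # dp[i] = minimum number of groups needed to cover labels[:i].
    dp = [0] + [float("inf")] * n
    prev = [0] * (n + 1)
    for i in range(1, n + 1):
        for j in range(i):
            if valid(j, i) and dp[j] + 1 < dp[i]:
                dp[i] = dp[j] + 1
                prev[i] = j

    # Walk the back-pointers to reconstruct the groups.
    groups = []
    i = n
    while i > 0:
        j = prev[i]
        groups.append(labels[j:i])
        i = j
    return groups[::-1]


example_counts = {"person": 1000, "car": 400, "dog": 90, "cat": 50, "kite": 8}
example_groups = group_labels(example_counts, max_ratio=3)
print(example_groups)  # [['person', 'car'], ['dog', 'cat'], ['kite']]
```

With these made-up counts and `max_ratio=3`, no valid group can span more than two adjacent labels, so three groups is the minimum. The double loop costs O(n^2) validity checks for n labels, which is cheap for label sets of the size found in typical image datasets.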
dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-11-26T16:28:32Z
No. of bitstreams: 0
en
dc.description.provenance | Made available in DSpace on 2025-11-26T16:28:32Z (GMT). No. of bitstreams: 0 | en
dc.description.tableofcontents | Verification Letter from the Oral Examination Committee
Acknowledgements
Abstract (Chinese)
Abstract
Contents
List of Figures
List of Tables
Chapter 1 Introduction
Chapter 2 Related Work
2.1 Resample Methods
2.1.1 LP-ROS and LP-RUS
2.1.2 ML-ROS and ML-RUS
2.1.3 MLeNN: Multi-Label Edited Nearest Neighbors
2.1.4 REMEDIAL and REMEDIAL-HwR
2.1.5 MLTL
2.1.6 Class-aware Sampling
2.2 Cost-Sensitive Methods
2.2.1 Class-level Re-weighting
2.2.2 Sample-level Re-weighting
2.2.3 Hybrid Method
2.3 Ensemble Methods
2.4 Model Adaptation
2.5 Label Imbalance and Correlation Metrics
2.5.1 IRLBL (Imbalance Ratio per Label)
2.5.2 meanIR (Mean Imbalance Ratio)
2.5.3 CVIR (Coefficient of Variation of IR)
2.5.4 SCUMBLE (Score of Concurrence among Imbalanced Labels)
2.6 Imbalanced Datasets
2.6.1 COCO-MLT
2.6.2 VOC-MLT
2.6.3 MuReD
Chapter 3 Method
3.1 Label Grouping Problem Definition
3.2 Dynamic Programming Formulation
3.3 Label Grouping Algorithm Implementation Detail
3.3.1 Step 1: Valid Group Identification
3.3.2 Step 2: Dynamic Programming for Optimal Partition
3.3.3 Step 3: Reconstructing Label Groups
3.3.4 Computational Complexity
3.4 Auxiliary Label Expansion for Enhanced Group Training
3.4.1 Algorithm Detail: Progressive Label Group Expansion Minimizing Imbalance
Chapter 4 Experiments
4.1 Evaluation
4.2 Comparison Methods
4.3 Implementation Detail
4.4 Result Discussion
4.4.1 Trade-off Between Generalization and Frequency Imbalance
4.4.2 Analysis of Benefit of Auxiliary Labels Across Models
4.4.3 Mitigating Noise Caused by Label Imbalance on Multi-Label Training
Chapter 5 Conclusion
References
Appendix A: Proof of Optimality of the Dynamic Programming Algorithm
Appendix B: Partition Result Details
-
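Sections 2.5.1 to 2.5.3 of the table of contents name three label imbalance metrics. The sketch below assumes the definitions commonly used in the multi-label imbalance literature: the imbalance ratio of a label is the count of the most frequent label divided by that label's count, the mean imbalance ratio is the average of those ratios, and CVIR is their standard deviation divided by the mean. The function name `imbalance_metrics` and the use of the sample standard deviation are assumptions here, and the thesis may define the metrics differently.

```python
from statistics import mean, stdev
from typing import Dict, Tuple


def imbalance_metrics(counts: Dict[str, int]) -> Tuple[Dict[str, float], float, float]:
    """Return (per-label imbalance ratio, mean imbalance ratio, CVIR)."""
    if len(counts) < 2 or min(counts.values()) <= 0:
        raise ValueError("need at least two labels, each with a positive count")

    # Per-label ratio: count of the most frequent label over this label's count.
    top = max(counts.values())
    irlbl = {label: top / c for label, c in counts.items()}

    values = list(irlbl.values())
    mean_ir = mean(values)
    # Coefficient of variation: sample standard deviation over the mean (assumed).
    cvir = stdev(values) / mean_ir
    return irlbl, mean_ir, cvir


irlbl, mean_ir, cvir = imbalance_metrics({"a": 100, "b": 50, "c": 10})
print(irlbl)  # {'a': 1.0, 'b': 2.0, 'c': 10.0}
```

A mean ratio close to 1 indicates that labels occur about equally often, and larger values indicate stronger imbalance. SCUMBLE (section 2.5.4) is omitted because it also requires per-instance label sets, not just label counts.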
dc.language.iso | en | -
dc.subject | 深度學習 | -
dc.subject | 多標籤模型 | -
dc.subject | 多標籤分類 | -
dc.subject | 醫學眼底影像 | -
dc.subject | 不平衡多標籤數據集 | -
dc.subject | 疾病分類 | -
dc.subject | 圖片標記 | -
dc.subject | Deep Learning | -
dc.subject | Multi-Label Model | -
dc.subject | Multi-Label Classification | -
dc.subject | Medical Fundus Images | -
dc.subject | Imbalanced Multi-label Datasets | -
dc.subject | Disease Classification | -
dc.subject | Image Tagging | -
dc.title | 透過標籤分組與專屬模型提升不平衡多標籤影像分類表現 | zh_TW
dc.title | Improving Imbalanced Multi-Label Image Classification Performance via Label Grouping and Dedicated Models | en
dc.type | Thesis | -
dc.date.schoolyear | 114-1 | -
dc.description.degree | Master's | -
dc.contributor.oralexamcommittee | 吳曉光; 蔡瑞煌; 李明穗; 陳駿丞 | zh_TW
dc.contributor.oralexamcommittee | Hsiao-Kuang Wu; Rua-Huan Tsaih; Ming-Sui Lee; Jun-Cheng Chen | en
dc.subject.keyword | 深度學習, 多標籤模型, 多標籤分類, 醫學眼底影像, 不平衡多標籤數據集, 疾病分類, 圖片標記 | zh_TW
dc.subject.keyword | Deep Learning, Multi-Label Model, Multi-Label Classification, Medical Fundus Images, Imbalanced Multi-label Datasets, Disease Classification, Image Tagging | en
dc.relation.page | 60 | -
dc.identifier.doi | 10.6342/NTU202504480 | -
dc.rights.note | Not authorized | -
dc.date.accepted | 2025-09-19 | -
dc.contributor.author-college | College of Electrical Engineering and Computer Science | -
dc.contributor.author-dept | Graduate Institute of Networking and Multimedia | -
dc.date.embargo-lift | N/A | -
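Appears in Collections: Graduate Institute of Networking and Multimedia

Files in This Item:
File | Size | Format
ntu-114-1.pdf (not authorized for public access) | 4.89 MB | Adobe PDF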