  1. NTU Theses and Dissertations Repository
  2. College of Electrical Engineering and Computer Science
  3. Graduate Institute of Networking and Multimedia
Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/101014
Title: 透過標籤分組與專屬模型提升不平衡多標籤影像分類表現
Improving Imbalanced Multi-Label Image Classification Performance via Label Grouping and Dedicated Models
Authors: 賴冠瑜
Guan-Yu Lai
Advisor: 周承復
Cheng-Fu Chou
Keyword: Deep Learning, Multi-Label Model, Multi-Label Classification, Medical Fundus Images, Imbalanced Multi-label Datasets, Disease Classification, Image Tagging
Publication Year: 2025
Degree: Master's
Abstract: This study addresses label imbalance in multi-label image classification. In such tasks, label frequencies vary widely, so models favor frequent labels and neglect rare but important ones, a common challenge in applications such as image tagging and medical diagnosis. Existing methods typically rely on loss reweighting, data resampling, or vision-language models to improve performance on rare labels. However, these approaches may require extensive hyperparameter tuning or additional training data, and they may not generalize well to specific domains.
To improve performance in imbalanced multi-label image classification, we propose a novel approach that groups labels by frequency and trains a dedicated model for each group. This strategy improves training stability and focus while preserving inter-label dependencies. We introduce two algorithms: (1) a dynamic programming-based label grouping algorithm that ensures intra-group balance and minimizes the number of groups, and (2) an auxiliary label expansion algorithm that uses a balancing metric to select auxiliary labels to train alongside each group.
Our method is evaluated on three datasets: COCO-MLT, VOC-MLT, and MuReD. Experiments show that it significantly improves performance across head, mid-frequency, and tail labels.
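The full text of the thesis is restricted, so the abstract's label grouping step can only be illustrated loosely. The following is a minimal sketch, not the author's algorithm: it assumes a group counts as "balanced" when its largest label count is at most `max_ratio` times its smallest, and it uses dynamic programming over the frequency-sorted labels to find the fewest such groups. The function name `group_labels_by_frequency`, the `max_ratio` parameter, and the balance criterion itself are all hypothetical.

```python
from typing import List, Sequence


def group_labels_by_frequency(counts: Sequence[int], max_ratio: float) -> List[List[int]]:
    """Partition labels into the fewest contiguous frequency groups.

    Assumption (not taken from the thesis): a group is balanced when its
    largest label count is at most `max_ratio` times its smallest count.
    Returns groups of label indices, ordered from most to least frequent.
    """
    if max_ratio < 1:
        raise ValueError("max_ratio must be >= 1")
    # Sort label indices by descending count so each group is a contiguous run.
    order = sorted(range(len(counts)), key=lambda i: counts[i], reverse=True)
    n = len(order)
    # best[j] = minimum number of groups covering the first j sorted labels.
    best = [0] + [float("inf")] * n
    # cut[j] = start position of the last group in an optimal cover of j labels.
    cut = [0] * (n + 1)
    for j in range(1, n + 1):
        smallest = counts[order[j - 1]]
        for i in range(j - 1, -1, -1):
            largest = counts[order[i]]
            if largest > max_ratio * smallest:
                break  # moving further left only makes the group less balanced
            if best[i] + 1 < best[j]:
                best[j] = best[i] + 1
                cut[j] = i
    # Follow the cut pointers backwards to recover the groups.
    groups: List[List[int]] = []
    j = n
    while j > 0:
        i = cut[j]
        groups.append(order[i:j])
        j = i
    groups.reverse()
    return groups


if __name__ == "__main__":
    # Hypothetical per-label training counts, head to tail.
    label_counts = [1000, 900, 120, 100, 95, 10, 8]
    print(group_labels_by_frequency(label_counts, max_ratio=2.0))
```

Each returned group would then get its own classifier. The balance metric and any additional constraints used in the thesis may differ from the ratio threshold assumed here.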
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/101014
DOI: 10.6342/NTU202504480
Fulltext Rights: Not authorized
Embargo Lift Date: N/A
Appears in Collections: Graduate Institute of Networking and Multimedia

Files in This Item:
File: ntu-114-1.pdf (Restricted Access)
Size: 4.89 MB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
