  1. NTU Theses and Dissertations Repository
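  2. College of Electrical Engineering and Computer Science
  3. Department of Computer Science and Information Engineering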
Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/87177
Title: LaSER: Improving Class-Imbalanced Semi-Supervised Learning with Label Shift Estimation and Reweighting
Authors: Pin-Yen Huang (黃品硯)
Advisor: Jieh Hsiang (項潔)
Keyword: machine learning, semi-supervised learning, imbalanced learning, label shift estimation, image classification, class-imbalanced semi-supervised learning
Publication Year: 2023
Degree: Master's
Abstract: Traditional semi-supervised learning (SSL) methods assume that the classes in the training data are evenly distributed, that is, every class has the same number of training examples. Real-world datasets, however, often have imbalanced or long-tailed class distributions. This poses a significant challenge for traditional SSL algorithms, which tend to perform poorly in such conditions and become heavily biased toward predicting the classes with more training data. To address this problem, class-imbalanced semi-supervised learning (CISSL) studies how to make SSL algorithms more robust against imbalanced data. We identify two approaches in existing CISSL work: (1) improving the quality of pseudo-labels, and (2) adapting class-imbalanced learning techniques to SSL. The two approaches address different aspects of the problem. In this thesis, we propose a novel method that combines them by integrating DARP and Mixup-DRW, respectively, into existing SSL algorithms. Additionally, we improve label shift estimation (LSE) under class-imbalanced data, which further enhances the performance and robustness of SSL under various conditions and settings.
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/87177
DOI: 10.6342/NTU202300472
Fulltext Rights: Not authorized
Appears in Collections: Department of Computer Science and Information Engineering

Files in This Item:
File: ntu-111-1.pdf (Restricted Access)
Size: 2.16 MB
Format: Adobe PDF
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
