Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/92215
Title: 基於挖掘未標記偏差衝突樣本和損失重新加權之半監督去偏差
Semi-supervised Debiasing via Unlabeled Bias-conflicting Samples Discovering and Loss Reweighting
Authors: 楊秉蒼
Bing-Cang Yang
Advisor: 陳銘憲
Ming-Syan Chen
Keyword: 去偏差, 半監督學習 / Debias, Semi-supervised learning
Publication Year: 2024
Degree: 碩士 (Master's)
Abstract: 神經網絡在訓練過程中往往會因為偏頗的訓練資料而導致準確率下降。現有的研究緩解這個問題的方法是利用偏差衝突樣本(bias-conflicting sample)──即不包含偏差特徵的樣本──來鼓勵模型學習任務相關特徵,從而提高無偏差測試環境下模型的表現。然而,這些方法在辨識偏差衝突樣本的過程需要使用標籤,而標籤工作昂貴且耗時。為解決此問題,我們提出了一種兩階段去偏差框架:基於偏差衝突分數之損失重新加權(Bias-conflicting Score for Loss Reweighting, BSLR),旨在通過利用少量標記資料來評估無標記樣本中的偏差衝突程度並以此進行去偏差。在第一階段,我們從無標記資料中檢測偏差衝突樣本。此階段首先在標記資料上訓練一個有偏差和一個無偏差的分類器,並通過這兩個分類器推論無標記資料。我們測量輸出之間的差異作為衡量偏差衝突程度的指標,稱為「偏差衝突分數」,因為較高的分數代表樣本的偏差程度較低。在第二階段,我們利用前一階段獲得的偏差衝突分數重新調整損失以在訓練無偏差的分類器的過程中強調偏差衝突樣本。實驗結果表明 BSLR 通過利用無標記資料的資訊,使其表現優於最先進的方法,特別是在具有較大偏差和較少標籤的資料集上。
Neural networks often suffer significant performance deterioration when trained on biased datasets. Existing research mitigates this issue by utilizing samples that lack bias features (i.e., bias-conflicting samples) to encourage the model to learn task-relevant characteristics, thereby improving performance under unbiased testing. However, these methods rely heavily on label information to identify such samples, and data labeling is expensive and time-consuming. To address this issue, we propose a two-stage debiasing framework, named Bias-conflicting Score for Loss Reweighting (BSLR), which evaluates the degree of bias conflict in unlabeled samples for debiasing while leveraging only a small amount of labeled data. In the first stage, we detect bias-conflicting samples in the unlabeled data. The process begins by pre-training a biased classifier and a debiased classifier on the labeled data; the unlabeled data are then passed through both classifiers for inference. We measure the difference between their outputs as a score, named the Bias-conflicting Score, since a higher score indicates a lower degree of bias within the sample. In the second stage, we reweight the loss using the Bias-conflicting Scores obtained in the first stage, placing greater emphasis on bias-conflicting samples when retraining the debiased classifier. Experimental results show that BSLR outperforms state-of-the-art methods by incorporating information from unlabeled data, especially on datasets with large bias and few labels.
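The two-stage procedure described in the abstract can be sketched as follows. This is a minimal illustrative sketch, not the thesis's exact formulation: the L1 disagreement between classifier outputs and the mean-normalized loss weights are assumed details chosen for clarity.

```python
import numpy as np

def bias_conflicting_scores(p_biased: np.ndarray, p_debiased: np.ndarray) -> np.ndarray:
    """Stage 1: score each unlabeled sample by the disagreement between
    the biased and debiased classifiers' predicted class distributions.
    A higher score suggests the sample conflicts with the dataset bias.
    (L1 distance is an assumed choice of disagreement measure.)"""
    return np.abs(p_biased - p_debiased).sum(axis=1)

def reweighted_loss(per_sample_losses: np.ndarray, scores: np.ndarray,
                    eps: float = 1e-8) -> float:
    """Stage 2: reweight per-sample losses so high-score (bias-conflicting)
    samples are emphasized when retraining the debiased classifier.
    (Normalizing the scores to mean 1 is an assumed implementation detail.)"""
    weights = scores / (scores.mean() + eps)
    return float((weights * per_sample_losses).mean())

# Toy example: two unlabeled samples. The classifiers agree on the first
# (bias-aligned) and disagree on the second (bias-conflicting).
p_biased   = np.array([[0.9, 0.1], [0.9, 0.1]])
p_debiased = np.array([[0.9, 0.1], [0.1, 0.9]])
scores = bias_conflicting_scores(p_biased, p_debiased)  # [0.0, 1.6]
loss = reweighted_loss(np.array([0.5, 0.5]), scores)
```

In this toy run, the bias-aligned sample gets weight 0 and the bias-conflicting sample gets weight 2, so the retraining loss is driven entirely by the bias-conflicting sample.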
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/92215
DOI: 10.6342/NTU202400683
Fulltext Rights: 未授權 (Not authorized)
Appears in Collections: 電機工程學系

Files in This Item:
File: ntu-112-1.pdf (Restricted Access), 1.22 MB, Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
