Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/86507

| Title: | 利用未標記資料改善分類公平性之後處理演算法 Improving Fair Classification by Leveraging Unlabeled Data in Post-processing |
| Author: | Pei-Chi Wang (王珮綺) |
| Advisor: | Shang-Tse Chen (陳尚澤) |
| Keywords: | machine learning, binary classification, fairness, conditional fairness, unlabeled data, post-processing |
| Publication Year: | 2022 |
| Degree: | Master's |
| Abstract: | Fairness-aware machine learning algorithms have received wide attention in recent years, but most existing work is limited to specific settings: prior methods apply only to a particular type of model, or are restricted to a binary sensitive attribute. We study the fair binary classification problem and propose a post-processing method that applies to any classification model, handles a sensitive attribute with any number of groups, and targets a more general definition of fairness that covers several widely used fairness metrics. Using the information in unlabeled data, the proposed method adjusts a group-dependent threshold for each group so that the given fairness notion is satisfied. We design a surrogate function that allows the threshold-finding problem to be solved by gradient-based iterative optimization instead of grid search, avoiding the cost that additional sensitive groups or a change of fairness metric would impose on a grid search. Our approach enjoys rigorous theoretical guarantees and often achieves a better trade-off between accuracy and fairness in numerical experiments. (A minimal illustrative sketch of the group-dependent thresholding idea appears after this record.) |
| URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/86507 |
| DOI: | 10.6342/NTU202202340 |
| Full-text License: | Authorized (open access worldwide) |
| Electronic Full-text Release Date: | 2023-09-01 |
| Appears in Collections: | Department of Computer Science and Information Engineering |
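The abstract describes the method only at a high level. The sketch below is a rough illustration, not the thesis algorithm: the fairness notion (a demographic-parity-style shared selection rate), the sigmoid surrogate, the synthetic scores, and every name such as `fit_group_thresholds` are assumptions made purely for this example. It only shows the general idea of fitting one threshold per sensitive group on unlabeled scores by replacing the hard indicator with a differentiable surrogate and running gradient descent.

```python
# Minimal sketch, under the assumptions stated above (NOT the author's method):
# fit one decision threshold per sensitive group on unlabeled scores so that
# each group's (smoothed) positive rate matches a shared target rate.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def smoothed_positive_rate(scores, threshold, tau=0.05):
    """Differentiable surrogate for the hard positive rate mean(scores > threshold)."""
    return sigmoid((scores - threshold) / tau).mean()

def rate_gradient(scores, threshold, tau=0.05):
    """Derivative of the smoothed positive rate with respect to the threshold."""
    s = sigmoid((scores - threshold) / tau)
    return (-(s * (1.0 - s)) / tau).mean()

def fit_group_thresholds(scores_by_group, target_rate, lr=0.05, steps=500, tau=0.05):
    """Gradient descent on sum_g (smoothed_rate_g(t_g) - target_rate)^2,
    one threshold t_g per sensitive group, using only unlabeled scores."""
    thresholds = {g: 0.5 for g in scores_by_group}
    for _ in range(steps):
        for g, scores in scores_by_group.items():
            r = smoothed_positive_rate(scores, thresholds[g], tau)
            grad = 2.0 * (r - target_rate) * rate_gradient(scores, thresholds[g], tau)
            thresholds[g] -= lr * grad
    return thresholds

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Pretend these are unlabeled scores from a pre-trained classifier,
    # split by a sensitive attribute with three groups (synthetic data).
    scores = {
        "group_a": np.clip(rng.normal(0.60, 0.15, 5000), 0.0, 1.0),
        "group_b": np.clip(rng.normal(0.45, 0.15, 5000), 0.0, 1.0),
        "group_c": np.clip(rng.normal(0.50, 0.20, 5000), 0.0, 1.0),
    }
    # Shared target: the overall positive rate at the default 0.5 threshold.
    target = (np.concatenate(list(scores.values())) > 0.5).mean()

    thresholds = fit_group_thresholds(scores, target)
    for g, t in thresholds.items():
        hard_rate = (scores[g] > t).mean()
        print(f"{g}: threshold={t:.3f}, positive rate={hard_rate:.3f} (target {target:.3f})")
```

Running the script prints one threshold per group whose hard selection rate is close to the shared target. The actual thesis additionally balances accuracy against the fairness constraint, supports a more general conditional-fairness definition, and provides theoretical guarantees, none of which this toy example attempts.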
Files in This Item:
| File | Size | Format |
|---|---|---|
| U0001-1208202213404100.pdf | 1.04 MB | Adobe PDF |
Except where otherwise noted, all items in this repository are protected by copyright, with all rights reserved.
