Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/97563
| Title: | SALIERI: 透過配對相似品質的回答系統性量測大型語言模型評估者的自我增強偏誤 SALIERI: Systematic Assessment of Self-enhancement Bias in Large Language Model Evaluators by Pairing Similar Quality Response |
| Author: | 林鴻儒 Hung-Ju Lin |
| Advisor: | 許永真 Jane Yung-jen Hsu |
| Co-advisor: | 陳縕儂 Yun-Nung Chen |
| Keywords: | LLM-as-a-judge, Large Language Model, Self-Enhancement Bias |
| Publication year: | 2024 |
| Degree: | Master's |
| Abstract: | Self-enhancement bias is the tendency of a model to overrate the responses it generated itself. However, we found that lower-quality responses tend to produce higher bias scores, and that this effect is more pronounced in less capable models. This exposes a key limitation of current measurement methods: they are confounded by response quality and model capability, which makes their results inaccurate. To address this, we propose SALIERI, a method that pairs responses of similar quality to remove these confounds and quantify the true bias. Experiments on the SummEval dataset show that SALIERI reduced the bias-score gap between high- and low-quality responses from 2.55 to 0.75 for the less capable LLaMA 2 7B model and from 1.44 to 0.69 for the more capable Llama 3 8B model. These results indicate that SALIERI measures self-enhancement bias more accurately and reliably. |
| URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/97563 |
| DOI: | 10.6342/NTU202500980 |
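| Full-text authorization: | Authorized (publicly available worldwide) |
| Electronic full-text release date: | 2025-07-03 |
| Appears in collections: | Department of Computer Science and Information Engineering |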
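
The gap figures quoted in the abstract can be put side by side with a few lines of arithmetic. The sketch below is only an illustration: the names `REPORTED_GAPS`, `gap_reduction`, and `summarize` are hypothetical, and the relative reductions it prints are derived from the four reported numbers rather than stated in the thesis record.

```python
from __future__ import annotations

# Bias-score gaps between high- and low-quality responses as quoted in the
# abstract: (gap with the existing measurement, gap with SALIERI).
# The dictionary name and layout are assumptions made for this illustration.
REPORTED_GAPS = {
    "LLaMA 2 7B": (2.55, 0.75),
    "Llama 3 8B": (1.44, 0.69),
}


def gap_reduction(before: float, after: float) -> tuple[float, float]:
    """Return (absolute reduction, relative reduction in percent)."""
    absolute = before - after
    relative = 100.0 * absolute / before
    return absolute, relative


def summarize(gaps: dict[str, tuple[float, float]]) -> dict[str, tuple[float, float]]:
    """Apply gap_reduction to every model in the mapping."""
    return {model: gap_reduction(b, a) for model, (b, a) in gaps.items()}


if __name__ == "__main__":
    for model, (absolute, relative) in summarize(REPORTED_GAPS).items():
        print(f"{model}: gap reduced by {absolute:.2f} ({relative:.1f}%)")
```

Under these numbers the gap shrinks by 1.80 for LLaMA 2 7B and by 0.75 for Llama 3 8B, so the larger correction falls on the less capable model, in line with the abstract's observation that the confound is stronger there.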
Files in this item:
| File | Size | Format | |
|---|---|---|---|
| ntu-113-2.pdf | 564.36 kB | Adobe PDF | View/Open |
Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are otherwise indicated.
