SALIERI: Systematic Assessment of Self-enhancement Bias in Large Language Model Evaluators by Pairing Similar Quality Response

林鴻儒; Hung-Ju Lin

Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/97563

Title:	SALIERI: Systematic Assessment of Self-enhancement Bias in Large Language Model Evaluators by Pairing Similar Quality Response
Authors:	林鴻儒 Hung-Ju Lin
Advisor:	許永真 Jane Yung-jen Hsu
Co-Advisor:	陳縕儂 Yun-Nung Chen
Keyword:	LLM-as-a-judge, Large Language Model, Self-Enhancement Bias
Publication Year :	2024
Degree:	Master's
Abstract:	Self-enhancement bias is the tendency of a model to overrate responses it generated itself. However, we found that lower-quality responses tend to produce higher bias scores, and that this effect is more pronounced in less capable models. This exposes a key limitation of current measurement methods: they are confounded by response quality and model capability, which makes their results inaccurate. To address this, we propose SALIERI, a method that pairs responses of similar quality to remove these confounds and quantify the true degree of bias. Experiments on the SummEval dataset show that SALIERI reduced the bias score gap between high- and low-quality responses from 2.55 to 0.75 for the less capable LLaMA 2 7B model and from 1.44 to 0.69 for the more capable Llama 3 8B model. These results indicate that SALIERI yields more accurate and reliable measurements of self-enhancement bias.
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/97563
DOI:	10.6342/NTU202500980
Fulltext Rights:	Authorized (open access worldwide)
Embargo Lift Date:	2025-07-03
Appears in Collections:	Department of Computer Science and Information Engineering
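
The pairing idea summarized in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the `Response` fields, the greedy nearest-quality matching, the `tolerance` threshold, and the definition of bias as a mean score difference are all assumptions made for illustration, and the sample numbers are invented.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Response:
    quality: float      # reference quality, e.g. a human rating (assumed to be available)
    judge_score: float  # score assigned by the evaluator model


def pair_by_quality(own, others, tolerance=0.5):
    """Greedily pair each of the evaluator's own responses with the unused
    response from another source whose reference quality is closest.
    Candidates further apart than `tolerance` are left unpaired."""
    remaining = list(others)
    pairs = []
    for mine in own:
        if not remaining:
            break
        best = min(remaining, key=lambda r: abs(r.quality - mine.quality))
        if abs(best.quality - mine.quality) <= tolerance:
            pairs.append((mine, best))
            remaining.remove(best)
    return pairs


def paired_bias(pairs):
    """Mean judge-score advantage of own responses over quality-matched ones."""
    if not pairs:
        raise ValueError("no quality-matched pairs to compare")
    return mean([mine.judge_score - other.judge_score for mine, other in pairs])


# Invented toy data: the evaluator's own responses and responses from another model.
own = [Response(4.0, 4.5), Response(2.0, 3.5)]
others = [Response(3.9, 4.0), Response(2.1, 2.5), Response(5.0, 5.0)]

pairs = pair_by_quality(own, others)
print(paired_bias(pairs))  # 0.75
```

Because each comparison is made between responses of roughly equal reference quality, a positive value here reflects how the judge treats its own output rather than a genuine quality difference, which is the confound the abstract describes.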

Files in This Item:

File	Size	Format
ntu-113-2.pdf	564.36 kB	Adobe PDF
