Please use this identifier to cite or link to this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/97563
| Title: | SALIERI: Systematic Assessment of Self-enhancement Bias in Large Language Model Evaluators by Pairing Similar Quality Response |
| Authors: | 林鴻儒 Hung-Ju Lin |
| Advisor: | 許永真 Jane Yung-jen Hsu |
| Co-Advisor: | 陳縕儂 Yun-Nung Chen |
| Keyword: | LLM-as-a-judge, Large Language Model, Self-Enhancement Bias |
| Publication Year : | 2024 |
| Degree: | Master's |
| Abstract: | Self-enhancement bias is the tendency of a model to overrate its own responses. However, we found that lower-quality responses tend to produce higher bias scores, and that this effect is more pronounced in less capable models. This exposes a key limitation of current measurement methods: they are confounded by response quality and model capability, which makes their results inaccurate. To address this, we propose SALIERI, a method that pairs responses of similar quality to remove these confounds and quantify the true bias. Experiments on the SummEval dataset show that SALIERI reduces the gap in bias scores between high- and low-quality responses from 2.55 to 0.75 for the less capable Llama 2 7B model and from 1.44 to 0.69 for the more capable Llama 3 8B model. These results indicate that SALIERI makes measurements of self-enhancement bias more accurate and reliable. |
| URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/97563 |
| DOI: | 10.6342/NTU202500980 |
| Fulltext Rights: | Authorized (open access worldwide) |
| Embargo Lift Date: | 2025-07-03 |
| Appears in Collections: | Department of Computer Science and Information Engineering |
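
The pairing idea described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names `pair_by_quality` and `paired_bias`, the `quality` and `judge_score` fields, the greedy matching rule, the `tolerance` threshold, and the definition of bias as a mean score difference are all assumptions made for illustration, and the numbers are made-up toy data.

```python
from statistics import mean


def pair_by_quality(own, others, tolerance=0.5):
    """Greedily match each self-generated response with an unused response
    from another model whose reference quality is within `tolerance`.

    Hypothetical matching rule, assumed for illustration only.
    """
    pairs = []
    used = set()
    for own_resp in own:
        best = None  # (quality difference, index into `others`)
        for i, other_resp in enumerate(others):
            if i in used:
                continue
            diff = abs(own_resp["quality"] - other_resp["quality"])
            if diff <= tolerance and (best is None or diff < best[0]):
                best = (diff, i)
        if best is not None:
            used.add(best[1])
            pairs.append((own_resp, others[best[1]]))
    return pairs


def paired_bias(pairs):
    """Mean of (judge score for own response - judge score for its match).

    Assumed bias definition; the thesis may define the score differently.
    """
    return mean(o["judge_score"] - r["judge_score"] for o, r in pairs)


# Toy data: `quality` stands in for an independent reference rating,
# `judge_score` for the score the evaluated model gives as a judge.
own = [
    {"quality": 4.0, "judge_score": 4.5},
    {"quality": 2.0, "judge_score": 3.5},
]
others = [
    {"quality": 4.1, "judge_score": 4.0},
    {"quality": 2.2, "judge_score": 3.0},
    {"quality": 5.0, "judge_score": 5.0},
]

pairs = pair_by_quality(own, others)
bias = paired_bias(pairs)
print(len(pairs), bias)
```

Because each self-generated response is compared only with a response of nearly the same reference quality, a remaining score difference is less likely to reflect a quality gap and more likely to reflect the judge's preference for its own output.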
Files in This Item:
| File | Size | Format |
|---|---|---|
| ntu-113-2.pdf | 564.36 kB | Adobe PDF |
