LAREV: Leakage Aware Rationale Evaluation with Conditional V-information

李泓賢; Hong-Sian Li

Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/101846

Title:	LAREV: Leakage Aware Rationale Evaluation with Conditional V-information
Authors:	李泓賢 Hong-Sian Li
Advisor:	許永真 Yung-Jen Hsu
Co-Advisor:	陳縕儂 Yun-Nung Chen
Keyword:	Rationale Evaluation, Label Leakage, Leakage-Aware Evaluation, Information Theory, Conditional V-Information, Large Language Models
Publication Year :	2026
Degree:	Master's
Abstract:	With the widespread adoption of large language models in reasoning tasks, natural language rationales generated by models have become an important tool for supporting decision-making and improving interpretability. However, objectively evaluating whether a rationale truly provides additional, label-relevant information remains a challenging research problem. Recent work has proposed rationale evaluation methods based on Conditional V-information, which estimate the usable information contributed by a rationale by comparing model predictions with and without access to the rationale, conditioned on a given baseline input. In practice, however, such baseline inputs often already contain substantial label leakage, causing evaluation models to over-rely on baseline signals and misestimate rationale quality, which undermines the reliability of the evaluation results. In this thesis, we propose LAREV (Leakage-Aware Rationale Evaluation with Conditional V-information), a leakage-aware rationale evaluation framework designed to make baseline-conditioned evaluation more robust when the baseline input leaks label information. Without altering the form of the original evaluation metric, LAREV places two key constraints on how the evaluator model is trained. First, it applies Invariant Risk Minimization (IRM) to reduce the model's reliance on specific cues in the baseline input. Second, it introduces a leakage probing model to quantify and penalize the evaluator's ability to extract label-relevant information from the baseline input, thereby encouraging the model to attribute predictive improvements to the rationale itself. Experimental results on representative reasoning datasets, including ECQA and e-SNLI, show that LAREV effectively widens the evaluation gap between high-quality and low-quality rationales and exhibits more stable and discriminative ranking behavior than existing methods. Further analysis shows that LAREV reduces dependence on leaked baseline information while maintaining competitive predictive performance, supporting its practicality as a reliable framework for rationale evaluation.
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/101846
DOI:	10.6342/NTU202600527
Fulltext Rights:	Authorization granted (open access worldwide)
Embargo Lift Date:	2026-03-06
Appears in Collections:	Graduate Institute of Networking and Multimedia

Files in This Item:

File	Size	Format
ntu-114-1.pdf	1.52 MB	Adobe PDF
