Can we trust LLM Self-Explanations for Entity Resolution?

Large Language Models (LLMs) have recently shown strong performance on Entity Resolution (ER). When prompted, these models can also generate self-explanations alongside their predictions. While such self-explanations are appealing because they come at negligible computational cost, their actual reliability remains largely unexplored. In this paper, we present the first large-scale systematic evaluation of LLM self-explanations for ER, focusing on feature attribution and counterfactual explanations at both the attribute and token levels. Across three LLMs, ten datasets, and multiple prompting strategies, we show that self-explanations are often unstable, weakly faithful, and poorly aligned with counterfactual evidence, revealing a substantial gap between plausibility and causal relevance. We further demonstrate that established post-hoc explanation methods are significantly more trustworthy, but at a prohibitive computational cost when applied to LLMs. To bridge this gap, we introduce \uncerta{}, a hybrid explanation framework that uses self-explanations as priors to guide post-hoc exploration. \uncerta{} achieves explanation quality comparable to post-hoc methods while reducing cost by up to an order of magnitude.


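Entity Resolution

Different data providers may describe the same thing, that is, the same entity, in different ways (descriptions here include data formats, representations, and so on). Each description of an entity is called a reference to that entity. Entity resolution is the process of resolving a set of references and mapping them to entities in the real world. Entity Resolution is also known as Record Linkage, Object Identification, Individual Identification, and Duplicate Detection.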
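The mapping from a set of references to entities can be illustrated with a small sketch. It is a toy example under stated assumptions, not the LLM-based approach evaluated in the paper: the function names (`normalize`, `resolve`), the 0.85 similarity threshold, and the sample product strings are all hypothetical, and string similarity from Python's standard `difflib` stands in for a real matcher.

```python
from difflib import SequenceMatcher


def normalize(ref: str) -> str:
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    kept = "".join(ch.lower() if ch.isalnum() else " " for ch in ref)
    return " ".join(kept.split())


def resolve(references: list[str], threshold: float = 0.85) -> list[list[str]]:
    """Group references whose normalized text is similar (toy matcher).

    The threshold is an illustrative assumption, not a recommended value.
    """
    parent = list(range(len(references)))

    def find(i: int) -> int:
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    normalized = [normalize(r) for r in references]
    for i in range(len(references)):
        for j in range(i + 1, len(references)):
            score = SequenceMatcher(None, normalized[i], normalized[j]).ratio()
            if score >= threshold:
                # Treat the pair as the same entity and merge their groups.
                parent[find(j)] = find(i)

    clusters: dict[int, list[str]] = {}
    for i, ref in enumerate(references):
        clusters.setdefault(find(i), []).append(ref)
    return list(clusters.values())


refs = [
    "Apple iPhone 13, 128GB",
    "apple iphone 13 128gb",
    "Samsung Galaxy S21",
]
clusters = resolve(refs)
```

Here the first two strings differ only in format, so they end up in one group (one entity), while the third forms its own group. Pairwise comparison is quadratic in the number of references; practical systems usually add a blocking step first, which this sketch omits.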
