Large Language Models (LLMs) are prone to factual hallucinations, which undermine their reliability in real-world applications. Existing hallucination detectors mainly either extract micro-level intrinsic patterns for uncertainty quantification or elicit macro-level self-judgments through verbalized prompts. However, each approach addresses only a single facet of hallucination, either implicit neural uncertainty or explicit symbolic reasoning. They thus treat these inherently coupled behaviors in isolation and fail to exploit their interdependence for a holistic view. In this paper, we propose LaaB (Logical Consistency-as-a-Bridge), a framework that bridges neural features and symbolic judgments for hallucination detection. LaaB introduces a "meta-judgment" process that maps symbolic labels back into the feature space. It then exploits an inherent logical bridge: the response label and the meta-judgment label are either the same or opposite, depending on the semantics of the self-judgment. Using this bridge, LaaB aligns and integrates the dual-view signals via mutual learning to improve hallucination detection. Extensive experiments on 4 public datasets, across 4 LLMs and against 8 baselines, demonstrate the superiority of LaaB.
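The logical bridge described above can be illustrated with a minimal sketch. This is only one reading of the abstract and not the authors' implementation: it assumes binary labels, a self-judgment that either asserts or denies that the response is factual, and a meta-judgment label that records whether that self-judgment is itself correct. All function and parameter names here are hypothetical.

```python
# Illustrative sketch only: names and the binary-label setup are assumptions,
# not taken from the LaaB paper.

def meta_label(response_is_factual: bool, judged_factual: bool) -> bool:
    """Meta-judgment label: is the self-judgment itself correct?

    The self-judgment is correct exactly when it agrees with the
    ground-truth label of the response.
    """
    return response_is_factual == judged_factual


def response_label_from_meta(meta_is_correct: bool, judged_factual: bool) -> bool:
    """Map a meta-judgment label back to a response label.

    If the self-judgment says "factual", the two labels are the same;
    if it says "hallucinated", they are opposite.
    """
    return meta_is_correct if judged_factual else not meta_is_correct


def consistency_gap(p_response: float, p_meta: float, judged_factual: bool) -> float:
    """Disagreement between the two views, measured in response space.

    p_response: probability (from neural features) that the response is factual.
    p_meta: probability that the self-judgment is correct.
    An alignment objective could penalize this gap (an assumed, simplified
    stand-in for the mutual learning mentioned in the abstract).
    """
    p_from_meta = p_meta if judged_factual else 1.0 - p_meta
    return abs(p_response - p_from_meta)
```

For example, when the self-judgment says the response is hallucinated, a high probability that the response is factual should coincide with a low probability that the self-judgment is correct, so `consistency_gap(0.9, 0.1, judged_factual=False)` is close to zero.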