As Large Language Models (LLMs) are increasingly deployed in real-world settings, correctness alone is insufficient. Reliable deployment requires that a model maintain truthful beliefs under contextual perturbations. Existing evaluations largely rely on point-wise confidence measures such as Self-Consistency, which can mask brittle beliefs. We show that even facts answered with perfect self-consistency can rapidly collapse under mild contextual interference. To address this gap, we propose Neighbor-Consistency Belief (NCB), a structural measure of belief robustness that evaluates response coherence across a conceptual neighborhood. To validate the effectiveness of NCB, we introduce a new cognitive stress-testing protocol that probes output stability under contextual interference. Experiments across multiple LLMs show that performance on high-NCB data is relatively more resistant to interference. Finally, we present Structure-Aware Training (SAT), which optimizes for a context-invariant belief structure and reduces long-tail knowledge brittleness by approximately 30%. Code is available at https://github.com/zjunlp/belief.