ContextFocus: Activation Steering for Contextual Faithfulness in Large Language Models

Large Language Models (LLMs) encode vast amounts of parametric knowledge during pre-training. As world knowledge evolves, effective deployment increasingly depends on their ability to faithfully follow externally retrieved context. When such evidence conflicts with the model's internal knowledge, LLMs often default to memorized facts, producing unfaithful outputs. In this work, we introduce ContextFocus, a lightweight activation steering approach that improves contextual faithfulness in such knowledge-conflict settings while preserving fluency and efficiency. Unlike prior approaches, our solution requires no model finetuning and incurs minimal inference-time overhead. We evaluate ContextFocus on the ConFiQA benchmark, comparing it against strong baselines including ContextDPO, COIECD, and prompting-based methods. Extensive experiments show that ContextFocus significantly improves contextual faithfulness. Furthermore, we show that our method is complementary to prompting strategies and remains effective on larger models. Our results highlight the effectiveness, robustness, and efficiency of ContextFocus in improving the contextual faithfulness of LLM outputs.
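
The abstract does not spell out how the steering direction is obtained or where it is applied. As a generic illustration of activation steering in general, and not necessarily the ContextFocus procedure, the sketch below assumes a common recipe: take the difference between mean hidden activations under two prompt conditions, then add a scaled copy of that difference to a hidden state at inference time. The function names, the toy activations, and the scale `alpha` are all hypothetical.

```python
from typing import List

Vector = List[float]


def mean_vector(vectors: List[Vector]) -> Vector:
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def steering_vector(positive: List[Vector], negative: List[Vector]) -> Vector:
    """Difference of mean activations between two prompt conditions.

    Assumption: `positive` holds activations collected with the retrieved
    context in the prompt, `negative` holds activations collected without it.
    """
    pos_mean = mean_vector(positive)
    neg_mean = mean_vector(negative)
    return [p - q for p, q in zip(pos_mean, neg_mean)]


def apply_steering(hidden: Vector, direction: Vector, alpha: float) -> Vector:
    """Add the scaled direction to one hidden state (alpha = 0 is a no-op)."""
    return [h + alpha * d for h, d in zip(hidden, direction)]


# Toy activations (made-up numbers, not taken from any model).
with_context = [[1.0, 2.0, 0.0], [3.0, 2.0, 2.0]]
without_context = [[0.0, 1.0, 0.0], [2.0, 1.0, 0.0]]

direction = steering_vector(with_context, without_context)
steered = apply_steering([0.5, 0.5, 0.5], direction, alpha=2.0)
```

In a real model the same addition would typically be performed inside a forward hook on a chosen layer, which is why this style of intervention needs no weight updates and adds only one vector addition per steered layer.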
