Hallucination detection is critical for ensuring the reliability of large language models (LLMs) in context-based generation. Prior work has explored intrinsic signals available during generation, among which attention offers a direct view of grounding behavior. However, existing approaches typically rely on coarse summaries of attention, which fail to capture its fine-grained instabilities. Inspired by signal processing, we introduce a frequency-aware perspective on attention by analyzing its variation during generation. We model attention distributions as discrete signals and extract the high-frequency components that reflect rapid local changes in attention. Our analysis reveals that hallucinated tokens are associated with high-frequency attention energy, reflecting fragmented and unstable grounding behavior. Based on this insight, we develop a lightweight hallucination detector that uses high-frequency attention features. Experiments on the RAGTruth and HalluRAG benchmarks show that our approach outperforms verification-based, internal-representation-based, and attention-based methods across models and tasks.
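To make the idea of treating attention as a discrete signal concrete, the following is a minimal sketch and not the method described above. It assumes that one generated token's attention over the context positions forms a 1-D signal, and that "high-frequency energy" can be approximated as the share of spectral energy above a cutoff in a discrete Fourier transform. The function name `high_frequency_energy` and the `cutoff_ratio` parameter are hypothetical, introduced only for illustration; the abstract does not specify the actual filter or features.

```python
import numpy as np


def high_frequency_energy(attn, cutoff_ratio=0.5):
    """Share of spectral energy above a cutoff for one attention row.

    attn: 1-D array of attention weights over context positions.
    cutoff_ratio: hypothetical parameter; the fraction of rFFT bins
        treated as low-frequency (an assumption, not from the source).
    """
    attn = np.asarray(attn, dtype=float)
    # Subtract the mean so the constant (DC) component does not dominate.
    centered = attn - attn.mean()
    # Power spectrum of the real-valued signal.
    spectrum = np.abs(np.fft.rfft(centered)) ** 2
    total = spectrum.sum()
    if total <= 1e-20:
        # Perfectly flat attention carries no variation at any frequency.
        return 0.0
    cutoff = int(len(spectrum) * cutoff_ratio)
    return float(spectrum[cutoff:].sum() / total)


# Smooth attention: a single broad bump centered on one context region.
positions = np.arange(64)
smooth = np.exp(-0.5 * ((positions - 32) / 8.0) ** 2)
smooth /= smooth.sum()

# Fragmented attention: mass alternates between adjacent positions.
spiky = np.zeros(64)
spiky[::2] = 1.0
spiky /= spiky.sum()

smooth_score = high_frequency_energy(smooth)
spiky_score = high_frequency_energy(spiky)
```

Under these assumptions, the fragmented row scores close to 1 and the smooth row close to 0. In a detector, such per-token scores could be collected across heads and layers and passed to a small classifier, though that aggregation is also an assumption here.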