Long chain-of-thought (CoT) reasoning improves the performance of large language models, yet hallucinations in such settings often emerge subtly and propagate across reasoning steps. We suggest that hallucination in long CoT reasoning is better understood as an evolving latent state rather than a one-off erroneous event. Accordingly, we treat step-level hallucination judgments as local observations and introduce a cumulative prefix-level hallucination signal that tracks the global evolution of the reasoning state over the entire trajectory. Overall, our approach enables streaming hallucination detection in long CoT reasoning, providing real-time, interpretable evidence.
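To make the distinction between step-level observations and a prefix-level signal concrete, the following is a minimal sketch rather than the method itself. It assumes that some step-level judge (hypothetical, not specified here) emits a score in [0, 1] for each reasoning step, and it folds those scores into a running state with an exponentially weighted update. The update rule, the function name `stream_prefix_signal`, and the parameters `alpha` and `threshold` are all illustrative assumptions.

```python
from typing import Iterable, Iterator, Tuple


def stream_prefix_signal(
    step_scores: Iterable[float],
    alpha: float = 0.5,
    threshold: float = 0.5,
) -> Iterator[Tuple[int, float, bool]]:
    """Yield (step_index, prefix_signal, flagged) after each reasoning step.

    step_scores: hypothetical per-step hallucination scores in [0, 1],
        e.g. from some step-level judge (not specified by the source).
    alpha: assumed smoothing weight; how strongly a new step moves the state.
    threshold: assumed cut-off at or above which the prefix is flagged.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    state = 0.0  # prefix-level signal before any step is observed
    for index, score in enumerate(step_scores):
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"step score out of range at step {index}: {score}")
        # Exponentially weighted update: an assumption, not the paper's rule.
        state = (1.0 - alpha) * state + alpha * score
        yield index, state, state >= threshold


if __name__ == "__main__":
    # Made-up scores: the trajectory starts clean, drifts, then recovers.
    scores = [0.0, 0.0, 1.0, 1.0, 0.0]
    for index, signal, flagged in stream_prefix_signal(scores):
        print(f"step {index}: signal={signal:.3f} flagged={flagged}")
```

Because the function is a generator, a signal is available after every step without waiting for the full trajectory, which is the sense of "streaming" illustrated here. The state can also fall again after a clean step, so a single local judgment does not permanently decide the prefix-level verdict.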