In this paper, we present HalluCana, a canary lookahead to detect and correct factuality hallucinations of Large Language Models (LLMs) in long-form generation. HalluCana detects and intervenes as soon as traces of hallucination emerge, during and even before generation. To support timely detection, we exploit the internal factuality representation in the LLM's hidden space, investigating various proxies for the model's factuality self-assessment and discussing their relation to the model's context familiarity acquired during pre-training. On biography generation, our method improves generation quality by up to 2.5x while consuming over 6 times less compute.
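To make the idea of a hidden-space factuality proxy concrete, the sketch below shows one minimal form such a proxy could take: a linear probe over an intermediate hidden state that scores the upcoming claim and flags low-scoring steps for intervention. The class and function names (`FactualityProbe`, `should_intervene`), the sigmoid scoring, and the fixed threshold are illustrative assumptions, not the exact design used in this paper.

```python
import torch
import torch.nn as nn


class FactualityProbe(nn.Module):
    """Illustrative linear probe over an LLM hidden state.

    Maps a hidden vector from a chosen layer to a score in (0, 1),
    interpreted as the model's self-assessed factuality of the
    claim it is about to generate (assumption for illustration).
    """

    def __init__(self, hidden_size: int):
        super().__init__()
        self.scorer = nn.Linear(hidden_size, 1)

    def forward(self, hidden_state: torch.Tensor) -> torch.Tensor:
        # hidden_state: (batch, hidden_size) taken from one decoder layer
        return torch.sigmoid(self.scorer(hidden_state)).squeeze(-1)


def should_intervene(probe: FactualityProbe,
                     hidden_state: torch.Tensor,
                     threshold: float = 0.5) -> bool:
    """Flag the current decoding step for correction when the probe's
    factuality score falls below the (hypothetical) threshold."""
    with torch.no_grad():
        score = probe(hidden_state)
    return bool(score.item() < threshold)
```

In this sketch, the probe would be called before or during generation of each claim, so that low scores can trigger correction early rather than after the full output is produced.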