In this work, we show explicitly that modern LLMs tend to generate correct facts first, then "drift away" and generate incorrect facts later: this pattern has occasionally been observed but never properly measured. We develop a semantic drift score that measures the degree of separation between correct and incorrect facts in generated text, and we confirm our hypothesis when generating Wikipedia-style biographies. This correct-then-incorrect generation pattern suggests that factual accuracy can be improved by knowing when to stop generation. We therefore explore the trade-off between information quantity and factual accuracy for several early-stopping methods, and improve factuality by a large margin. We further show that reranking with semantic similarity yields additional gains, both over the baseline and when combined with early stopping. Finally, we try calling an external API to bring the model back onto the right generation path, but this does not yield positive results. Overall, our methods generalize and can be applied to any long-form text generation task to produce more reliable information, balancing the trade-offs between factual accuracy, information quantity, and computational cost.
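To make the idea concrete, here is a minimal sketch of one way such a separation score could be computed; the exact definition used in the work is not given here, so the function below (`semantic_drift_score`, operating on a hypothetical sequence of per-fact correctness labels) is an illustrative assumption, not the paper's formula. It scores how well a single cut point splits a generation into a correct prefix followed by an incorrect suffix.

```python
def semantic_drift_score(labels):
    """Illustrative (hypothetical) drift score.

    `labels` is a list of per-fact correctness labels in generation
    order: 1 = correct fact, 0 = incorrect fact. For every possible
    cut point, we average the fraction of correct facts before the cut
    with the fraction of incorrect facts after it, and return the best
    such average. 1.0 means a perfect correct-then-incorrect split
    exists; 0.5 means no separation at all.
    """
    n = len(labels)
    best = 0.0
    for k in range(n + 1):           # try every cut point, including the ends
        head, tail = labels[:k], labels[k:]
        # empty segments are treated as vacuously pure
        correct_head = sum(head) / len(head) if head else 1.0
        incorrect_tail = (len(tail) - sum(tail)) / len(tail) if tail else 1.0
        best = max(best, (correct_head + incorrect_tail) / 2)
    return best


# A biography whose first three facts are correct and last two are wrong
# separates perfectly; an alternating sequence separates poorly.
print(semantic_drift_score([1, 1, 1, 0, 0]))  # -> 1.0
print(semantic_drift_score([1, 0, 1, 0]))
```

Under a definition like this, a high score directly motivates early stopping: cutting generation at the best split point keeps most correct facts while discarding most incorrect ones.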