Recent advancements in Large Language Models (LLMs) have significantly improved text generation capabilities, but these systems are still known to hallucinate, and granular uncertainty estimation for long-form LLM generations remains challenging. In this work, we propose Graph Uncertainty, which represents the relationship between LLM generations and the claims within them as a bipartite graph and estimates claim-level uncertainty with a family of graph centrality metrics. Under this view, existing uncertainty estimation methods based on the concept of self-consistency can be seen as using degree centrality as an uncertainty measure, and we show that more sophisticated alternatives such as closeness centrality provide consistent gains at claim-level uncertainty estimation. Moreover, we present uncertainty-aware decoding techniques that leverage both the graph structure and the uncertainty estimates to improve the factuality of LLM generations by preserving only the most reliable claims. Compared to existing methods, our graph-based uncertainty metrics achieve an average of 6.8% relative gains on AUPRC across various long-form generation settings, and our end-to-end system provides consistent 2-4% gains in factuality over existing decoding techniques while significantly improving the informativeness of generated responses.
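The centrality view described above can be illustrated with a minimal sketch. The graph construction, node names, and toy data below are hypothetical, assuming a bipartite graph whose two sides are sampled responses and extracted claims, with an edge whenever a response supports a claim; degree centrality then recovers a self-consistency-style count of supporting samples, while closeness centrality (computed here via BFS shortest paths) uses the wider graph structure.

```python
from collections import deque

# Hypothetical toy bipartite graph: sampled responses on one side,
# extracted claims on the other; an edge (response, claim) means the
# response supports the claim.
edges = [
    ("resp_1", "claim_A"), ("resp_1", "claim_B"),
    ("resp_2", "claim_A"), ("resp_2", "claim_C"),
    ("resp_3", "claim_A"),
]

adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

def degree_centrality(node):
    # Degree centrality ~ self-consistency: for a claim node, it is
    # proportional to the number of sampled responses supporting it.
    return len(adj[node]) / (len(adj) - 1)

def closeness_centrality(node):
    # Closeness centrality rewards claims that are close to the rest of
    # the graph, not only those with many direct supporters.
    dist = {node: 0}
    queue = deque([node])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0

for claim in ("claim_A", "claim_B", "claim_C"):
    print(claim, degree_centrality(claim), closeness_centrality(claim))
```

In this toy instance, `claim_A` (supported by all three responses) scores highest under both metrics; the two metrics diverge on claims with equal direct support but different positions in the graph, which is where the closeness-based estimate can provide the gains reported above.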