LLMs achieve remarkable performance but suffer from hallucinations. Most research on hallucination detection focuses on questions with short, concrete answers whose faithfulness is easy to verify; detecting hallucinations in open-ended text generation is more challenging. Some researchers use external knowledge to detect hallucinations in generated text, but external resources for specific scenarios are hard to access. Recent studies on detecting hallucinations in long text without external resources compare the consistency of multiple sampled outputs. To handle long texts, researchers split them into multiple facts and individually compare the consistency of each pair of facts. However, these methods (1) hardly achieve alignment among multiple facts and (2) overlook dependencies between contextual facts. In this paper, we propose graph-based context-aware (GCA) hallucination detection for text generation, which aligns knowledge facts and considers the dependencies between contextual knowledge triples in consistency comparison. In particular, to align multiple facts, we conduct triple-oriented response segmentation to extract multiple knowledge triples. To model dependencies among contextual knowledge triples (facts), we organize the contextual triples into a graph and enhance the triples' interactions via RGCN-based message passing and aggregation. To avoid omitting knowledge triples in long text, we conduct LLM-based reverse verification by reconstructing the knowledge triples. Experiments show that our model enhances hallucination detection and outperforms all baselines.
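The triple-graph step above can be sketched in miniature. This is an illustrative toy, not the authors' implementation: the triples are hand-written, the embeddings are hash-based stand-ins, and the relation-typed message passing is a drastically simplified, untrained RGCN-style update (a real system would use an LLM extractor and a learned RGCN, e.g. from a graph library). All function names and the relation types `shared_subject` / `entity_chain` are assumptions for this sketch.

```python
# Toy sketch: build a graph over knowledge triples and run one RGCN-style
# message-passing layer. Everything here is illustrative; a real pipeline
# would extract triples with an LLM and learn the relation-specific weights.
import hashlib


def embed(text, dim=8):
    """Deterministic toy embedding: hash bytes mapped into [-1, 1]."""
    digest = hashlib.sha256(text.encode()).digest()
    return [(b / 127.5) - 1.0 for b in digest[:dim]]


def build_triple_graph(triples):
    """Nodes are triples; typed directed edges link triples sharing an entity."""
    edges = []  # (src_index, dst_index, relation_type)
    for i, (s1, _, o1) in enumerate(triples):
        for j, (s2, _, o2) in enumerate(triples):
            if i == j:
                continue
            if s1 == s2:
                edges.append((i, j, "shared_subject"))
            if o1 == s2 or s1 == o2:
                edges.append((i, j, "entity_chain"))
    return edges


def rgcn_layer(triples, edges, dim=8):
    """One simplified RGCN-style pass: self-loop plus relation-scaled,
    degree-normalized messages (scalar weights stand in for W_r matrices)."""
    rel_weight = {"shared_subject": 0.5, "entity_chain": 0.3}  # toy weights
    h = [embed(" ".join(t), dim) for t in triples]
    out = [list(v) for v in h]  # self-loop term
    for src, dst, rel in edges:
        # normalization constant: in-degree of dst under this relation type
        deg = sum(1 for e in edges if e[1] == dst and e[2] == rel)
        for k in range(dim):
            out[dst][k] += rel_weight[rel] * h[src][k] / deg
    return out


# Minimal usage: three triples where two pairs share an entity.
triples = [
    ("Paris", "capital_of", "France"),
    ("France", "located_in", "Europe"),
    ("Paris", "has_population", "2,100,000"),
]
edges = build_triple_graph(triples)
node_states = rgcn_layer(triples, edges)
```

After this layer, each triple's representation mixes in its entity-linked neighbors, so a downstream consistency comparison scores each fact in context rather than in isolation.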