Retrieval-augmented generation (RAG) systems based on large language models (LLMs) have made significant progress. They can effectively reduce factuality hallucinations, but faithfulness hallucinations still persist. Previous methods for detecting faithfulness hallucinations either fail to capture the model's internal reasoning process or handle these features coarsely, making them difficult for discriminators to learn from. This paper proposes a semantic-level internal-reasoning-graph method for detecting faithfulness hallucinations. Specifically, we first extend the layer-wise relevance propagation algorithm from the token level to the semantic level and construct an internal reasoning graph from the resulting attribution vectors, yielding a more faithful semantic-level representation of dependencies. Furthermore, we design a general framework based on a small pre-trained language model that exploits the dependencies in the LLM's reasoning for training and hallucination detection, and that can dynamically adjust the pass rate of correct samples through a threshold. Experimental results demonstrate that our method achieves better overall performance than state-of-the-art baselines on RAGTruth and Dolly-15k.
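As a rough illustration of the graph-construction step described above, the sketch below aggregates a token-level layer-wise relevance propagation (LRP) relevance matrix into semantic-level attribution vectors and then keeps the strongest dependencies as edges of an internal reasoning graph. The span segmentation, array shapes, function names, and the top-k edge rule are illustrative assumptions for a minimal sketch, not the paper's actual implementation.

```python
import numpy as np

def semantic_attribution(token_relevance: np.ndarray,
                         spans: list[tuple[int, int]]) -> np.ndarray:
    """Sum token-level LRP relevance within each source-side semantic span.

    token_relevance: (n_output_tokens, n_input_tokens) relevance matrix
        (assumed to come from a token-level LRP pass).
    spans: (start, end) token index ranges, one per semantic unit.
    Returns an (n_output_tokens, n_spans) attribution matrix.
    """
    return np.stack(
        [token_relevance[:, s:e].sum(axis=1) for s, e in spans], axis=1
    )

def reasoning_graph(attr: np.ndarray,
                    out_spans: list[tuple[int, int]],
                    top_k: int = 3) -> np.ndarray:
    """Build a semantic-level dependency graph: edge (i, j) = 1 when generated
    semantic unit i attributes strongly to source semantic unit j."""
    # Aggregate output tokens into output-side semantic units as well.
    unit_attr = np.stack([attr[s:e].mean(axis=0) for s, e in out_spans], axis=0)
    graph = np.zeros_like(unit_attr)
    for i, row in enumerate(unit_attr):
        graph[i, np.argsort(row)[-top_k:]] = 1.0  # keep the top-k dependencies
    return graph

# Toy usage: 6 output tokens, 8 input tokens, 3 source spans, 2 output spans.
rng = np.random.default_rng(0)
R = rng.random((6, 8))                              # stand-in for LRP relevance
A = semantic_attribution(R, [(0, 3), (3, 6), (6, 8)])
G = reasoning_graph(A, [(0, 3), (3, 6)], top_k=2)
print(G)
```

The resulting graph (or the underlying attribution vectors) would then be fed to a small pre-trained language model acting as the hallucination discriminator, with a decision threshold tuned to trade off detection against the pass rate of correct samples.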