While recent Large Language Models (LLMs) have proven useful in answering user queries, they are prone to hallucination, and their responses often lack credibility due to missing references to reliable sources. An intuitive solution to these issues would be to include in-text citations referring to external documents as evidence. While previous works have directly prompted LLMs to generate in-text citations, their performance is far from satisfactory, especially for smaller LLMs. In this work, we propose an effective training framework that uses fine-grained rewards to teach LLMs to generate highly supportive and relevant citations while ensuring the correctness of their responses. We also conduct a systematic analysis of applying these fine-grained rewards to common LLM training strategies, demonstrating their advantage over conventional practices. We conduct extensive experiments on Question Answering (QA) datasets taken from the ALCE benchmark and validate the model's generalizability using EXPERTQA. On LLaMA-2-7B, incorporating fine-grained rewards achieves the best performance among all baselines, even surpassing that of GPT-3.5-turbo.