Combining retrieved passages with large language models (LLMs) such as ChatGPT has significantly improved open-domain question answering. However, the best way to incorporate retrieved passages into the answer generation process remains underexplored. This paper aims to fill this gap by investigating different methods of combining retrieved passages with LLMs to enhance answer generation. We begin by examining the limitations of a commonly used concatenation approach. Surprisingly, this approach often generates "unknown" outputs, even when the correct document is among the top-k retrieved passages. To address this issue, we explore four alternative strategies for integrating the retrieved passages with the LLMs: two single-round methods that use chain-of-thought reasoning and two multi-round strategies that incorporate feedback loops. Through comprehensive analyses and experiments, we provide observations on how to effectively leverage retrieved passages to enhance the answer generation capability of LLMs.
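As a rough illustration of the concatenation baseline described in the abstract, the sketch below joins the top-k retrieved passages ahead of the question in a single prompt and flags an "unknown" reply. The prompt wording and the helper names `build_concat_prompt` and `is_unknown` are assumptions made for this example; they are not taken from the paper.

```python
from typing import List


def build_concat_prompt(question: str, passages: List[str], k: int = 5) -> str:
    """Concatenate the top-k passages ahead of the question.

    The template text is a hypothetical stand-in, not the paper's prompt.
    """
    top = passages[:k]  # keep only the k highest-ranked passages
    context = "\n".join(f"Passage {i}: {p}" for i, p in enumerate(top, start=1))
    return (
        "Answer the question using the passages below. "
        'If the passages do not contain the answer, reply "unknown".\n\n'
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )


def is_unknown(answer: str) -> bool:
    """Return True when a model reply amounts to the literal string 'unknown'."""
    return answer.strip().strip('."').lower() == "unknown"
```

A reply flagged by `is_unknown` is the failure case the abstract highlights: one plausible use is to count how often it occurs when the correct document is known to be among the passages passed in.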