Open-domain question answering (ODQA) has become a central research topic in information systems. Existing methods follow two main paradigms for collecting evidence: (1) the \textit{retrieve-then-read} paradigm retrieves pertinent documents from an external corpus; and (2) the \textit{generate-then-read} paradigm employs large language models (LLMs) to generate relevant documents. However, neither paradigm alone fully addresses the multifaceted requirements for evidence. To this end, we propose LLMQA, a generalized framework that formulates the ODQA process as three basic steps: query expansion, document selection, and answer generation, combining the strengths of retrieval-based and generation-based evidence. Because LLMs perform well across a wide range of tasks, we instruct them to play multiple roles within our framework, as generators, rerankers, and evaluators, and integrate these roles so that they collaborate throughout the ODQA process. Furthermore, we introduce a novel prompt optimization algorithm that refines the role-playing prompts and steers LLMs toward higher-quality evidence and answers. Extensive experiments on widely used benchmarks (NQ, WebQ, and TriviaQA) demonstrate that LLMQA achieves the best performance in both answer accuracy and evidence quality, showcasing its potential for advancing ODQA research and applications.
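As a rough illustration of the three steps named in the abstract (query expansion, document selection, answer generation), the following is a minimal sketch and not the authors' implementation. The `LLM` and `Retriever` callables, every function name, the prompt wording, and the assignment of the generator and reranker roles to particular steps are all assumptions made for this example; the evaluator role and the prompt optimization algorithm are not modeled here.

```python
from typing import Callable, List

# Hypothetical interfaces (assumptions, not part of LLMQA):
# an LLM maps a prompt string to a completion string,
# a retriever maps a query string to a list of documents.
LLM = Callable[[str], str]
Retriever = Callable[[str], List[str]]


def expand_query(llm: LLM, question: str) -> str:
    """Step 1: the LLM acts as a generator and writes an expansion passage."""
    expansion = llm(f"Write a short background passage for the question: {question}")
    return f"{question} {expansion}"


def select_documents(
    llm: LLM, retriever: Retriever, question: str, expanded: str, k: int = 2
) -> List[str]:
    """Step 2: pool retrieved and generated documents, then rerank with the LLM."""
    retrieved = retriever(expanded)
    generated = [llm(f"Generate a document that answers: {question}")]
    candidates = retrieved + generated

    def score(doc: str) -> float:
        # The LLM acts as a reranker; unparseable replies fall back to 0.
        reply = llm(
            "Rate from 0 to 10 how useful this document is for the question.\n"
            f"Question: {question}\nDocument: {doc}\nScore:"
        )
        try:
            return float(reply.strip())
        except ValueError:
            return 0.0

    # sorted() is stable, so ties keep their original candidate order.
    return sorted(candidates, key=score, reverse=True)[:k]


def generate_answer(llm: LLM, question: str, docs: List[str]) -> str:
    """Step 3: the LLM reads the selected evidence and produces an answer."""
    evidence = "\n".join(docs)
    return llm(
        "Answer the question using the evidence.\n"
        f"Evidence:\n{evidence}\nQuestion: {question}\nAnswer:"
    ).strip()


def answer_question(llm: LLM, retriever: Retriever, question: str, k: int = 2) -> str:
    """Run the three steps in sequence."""
    expanded = expand_query(llm, question)
    docs = select_documents(llm, retriever, question, expanded, k)
    return generate_answer(llm, question, docs)
```

In this sketch both evidence sources feed a single candidate pool before reranking, which is one simple way to combine retrieval-based and generation-based evidence; the actual framework may combine them differently.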