Open-domain question answering is a crucial task that often requires access to external information. Existing methods typically adopt a single-turn retrieve-then-read approach: relevant documents are retrieved first, and the question is then answered from the retrieved information. However, answering some questions requires implicit knowledge that cannot be retrieved directly from the question itself. In this work, we propose BeamSearchQA, a novel question-answering pipeline that uses large language models to iteratively generate new questions about the original question, enabling a multi-step reasoning process. By refining and expanding the scope of the question at each step, our method aims to capture and use hidden knowledge that may not be directly obtainable through retrieval. We evaluate our approach on the widely used open-domain NQ and WebQ datasets. The experimental results show that BeamSearchQA significantly outperforms other zero-shot baselines, indicating its effectiveness for open-domain question answering.
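The abstract does not spell out the algorithm, so the following is only a minimal sketch of one way iterative question generation could be combined with a beam search, as the name BeamSearchQA suggests. Every name here (`State`, `iterative_question_search`, `generate_questions`, `answer_question`, `score_state`, `beam_size`, `depth`) is hypothetical, and the language model and knowledge source are replaced by toy lookup tables so that the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class State:
    """One beam entry: the questions asked so far and the evidence gathered."""
    questions: Tuple[str, ...]
    evidence: Tuple[str, ...]
    score: float = 0.0


def iterative_question_search(
    question: str,
    generate_questions: Callable[[str, Sequence[str]], Sequence[str]],
    answer_question: Callable[[str], str],
    score_state: Callable[[str, Sequence[str]], float],
    beam_size: int = 2,
    depth: int = 2,
) -> State:
    """Expand follow-up questions for `depth` rounds, keeping the top `beam_size` states."""
    beam = [State(questions=(question,), evidence=())]
    for _ in range(depth):
        candidates = []
        for state in beam:
            # Assumption: a language model would propose these follow-up questions.
            for follow_up in generate_questions(question, state.evidence):
                fact = answer_question(follow_up)
                evidence = state.evidence + (fact,)
                candidates.append(
                    State(
                        questions=state.questions + (follow_up,),
                        evidence=evidence,
                        score=score_state(question, evidence),
                    )
                )
        if not candidates:
            break  # no state could be expanded further
        candidates.sort(key=lambda s: s.score, reverse=True)
        beam = candidates[:beam_size]
    return beam[0]


# Toy stand-ins for the language model and the knowledge source (not real APIs).
FACTS = {
    "Who wrote Hamlet?": "Hamlet was written by William Shakespeare.",
    "What is Hamlet about?": "Hamlet is a tragedy about a Danish prince.",
    "Where was William Shakespeare born?":
        "William Shakespeare was born in Stratford-upon-Avon.",
    "Who is the prince in Hamlet?": "The prince in Hamlet is named Hamlet.",
}


def toy_generate(question: str, evidence: Sequence[str]) -> Sequence[str]:
    if not evidence:
        return ["Who wrote Hamlet?", "What is Hamlet about?"]
    if any("Shakespeare" in fact for fact in evidence):
        return ["Where was William Shakespeare born?"]
    return ["Who is the prince in Hamlet?"]


def toy_answer(follow_up: str) -> str:
    return FACTS[follow_up]


def toy_score(question: str, evidence: Sequence[str]) -> float:
    # Toy heuristic: count the facts that mention the author.
    return float(sum("Shakespeare" in fact for fact in evidence))


best = iterative_question_search(
    "Where was the author of Hamlet born?", toy_generate, toy_answer, toy_score
)
print(best.questions)
print(best.evidence[-1])
```

In this toy run, the original question never names the author, so the fact about the birthplace only becomes reachable after a follow-up question has surfaced the author's name. That is the kind of implicit knowledge the abstract says a single retrieval pass may miss. A real system would replace the three toy functions with model calls and a learned or prompted scoring step.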