The similarity between the question and the indexed documents is a crucial factor in document retrieval for retrieval-augmented question answering. Although similarity is typically the only signal used to obtain relevant documents, it is not the only option for entity-centric questions. In this study, we propose Entity Retrieval, a novel retrieval method that, rather than relying on question-document similarity, uses the salient entities in the question to identify the documents to retrieve. We conduct an in-depth comparison of dense and sparse retrieval methods against Entity Retrieval. Our findings reveal that our method not only yields more accurate answers to entity-centric questions but also operates more efficiently.
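The idea of retrieving by entity rather than by similarity can be illustrated with a minimal sketch. This is not the authors' implementation: the toy `ARTICLES` table, the helper names `find_entities` and `entity_retrieve`, and the use of exact phrase matching in place of a real entity linker are all assumptions made for illustration only.

```python
import re

# Hypothetical toy knowledge base: entity name -> article text (illustrative only).
ARTICLES = {
    "Marie Curie": "Marie Curie was a physicist and chemist who conducted "
                   "pioneering research on radioactivity.",
    "Ada Lovelace": "Ada Lovelace was an English mathematician known for her "
                    "work on Charles Babbage's Analytical Engine.",
}


def find_entities(question, articles):
    """Return the entity names that occur in the question.

    Assumption: case-insensitive whole-phrase matching stands in for
    a proper entity linker.
    """
    found = []
    for name in articles:
        pattern = r"\b" + re.escape(name) + r"\b"
        if re.search(pattern, question, flags=re.IGNORECASE):
            found.append(name)
    return found


def entity_retrieve(question, articles):
    """Look up documents by the entities in the question.

    No question-document similarity score is computed at any point.
    """
    return [(name, articles[name]) for name in find_entities(question, articles)]


docs = entity_retrieve("Where was Marie Curie born?", ARTICLES)
for name, text in docs:
    print(name, "->", text)
```

In this sketch the retrieval cost is a lookup keyed on the detected entities, so a question with no recognized entity simply returns an empty list; how such cases should be handled in practice is left open here.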