We address the task of predicting the gain from using retrieval-augmented generation (RAG) for question answering, relative to not using it. We study the performance of a few pre-retrieval and post-retrieval predictors originally devised for ad hoc retrieval. We also study a few post-generation predictors, one of which is novel to this study and achieves the best prediction quality. Our results show that the most effective prediction approach is a novel supervised predictor that explicitly models the semantic relationships among the question, the retrieved passages, and the generated answer.