On-the-fly retrieval of relevant knowledge has proven to be an essential element of reliable systems for tasks such as open-domain question answering and fact verification. However, because retrieval systems are imperfect, generation models must produce outputs from partially or entirely irrelevant passages. This can cause over- or under-reliance on the context and lead to problems such as hallucinations in the generated output. To alleviate these problems, we propose FILCO, a method that improves the quality of the context provided to the generator by (1) identifying useful context based on lexical and information-theoretic approaches, and (2) training context filtering models that can filter retrieved contexts at test time. We experiment on six knowledge-intensive tasks with FLAN-T5 and LLaMa2, and demonstrate that our method outperforms existing approaches on extractive question answering (QA), complex multi-hop and long-form QA, fact verification, and dialog generation tasks. FILCO effectively improves the quality of context, whether or not it supports the canonical output.
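To make step (1) concrete, the following is a minimal sketch of one possible lexical filter. The abstract does not say which lexical measure is used, so the unigram-overlap F1 score here is an assumption, as are the sentence-level granularity, the threshold value, and the function names `tokenize`, `unigram_f1`, and `select_best_sentence`, which are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def unigram_f1(candidate, reference):
    """Unigram-overlap F1 between two strings (assumed lexical measure)."""
    cand = Counter(tokenize(candidate))
    ref = Counter(tokenize(reference))
    overlap = sum((cand & ref).values())  # multiset intersection size
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def select_best_sentence(sentences, reference, threshold=0.5):
    """Return the sentence with the highest overlap score, or an empty
    string when no sentence reaches the (hypothetical) threshold."""
    if not sentences:
        return ""
    best = max(sentences, key=lambda s: unigram_f1(s, reference))
    return best if unigram_f1(best, reference) >= threshold else ""


retrieved = [
    "Paris is the capital of France.",
    "The Eiffel Tower opened in 1889.",
    "Bananas are rich in potassium.",
]
reference = "What is the capital of France? Paris"
filtered = select_best_sentence(retrieved, reference)
```

In this sketch the reference string concatenates the question with a known answer, which is only available when building training data. At test time, a trained filtering model as in step (2) would have to predict the useful span from the question and the retrieved passage alone.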