Retrieval-augmented Large Language Models (LLMs) offer substantial performance gains in knowledge-intensive scenarios. However, these methods often struggle with complex inputs and suffer from noisy knowledge retrieval, which notably hinders model effectiveness. To address this issue, we introduce BlendFilter, a novel approach that improves retrieval-augmented LLMs by combining query generation blending with knowledge filtering. BlendFilter introduces a blending process in its query generation stage, integrating both external and internal knowledge augmentation with the original query to ensure comprehensive information gathering. In addition, a dedicated knowledge filtering module leverages the LLM's own capabilities to discard irrelevant retrieved content. We conduct extensive experiments on three open-domain question answering benchmarks, and the results show that BlendFilter significantly outperforms state-of-the-art baselines.
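The pipeline sketched in the abstract can be illustrated end to end. The following is a minimal, hypothetical sketch, not the paper's implementation: `call_llm` and `retrieve` are stand-in stubs, and the blending/filtering prompts are illustrative assumptions.

```python
# Hedged sketch of a BlendFilter-style pipeline: query blending with
# external/internal knowledge augmentation, then LLM-based filtering.
# call_llm and retrieve are toy stubs, NOT the paper's actual API.

def call_llm(prompt: str) -> str:
    # Stub LLM; a real system would query an actual model here.
    low = prompt.lower()
    if "background" in low:
        return "LLM-internal background knowledge"
    if "keep only" in low:
        # Filtering stub: keep passages mentioning the query topic.
        body = prompt.split("---")[1]
        return "\n".join(l for l in body.splitlines() if "Paris" in l)
    return "Paris"

def retrieve(query: str, k: int = 2) -> list[str]:
    # Stub retriever over a toy corpus.
    corpus = [
        "Paris is the capital of France.",
        "Bananas are rich in potassium.",
    ]
    return corpus[:k]

def blendfilter_answer(question: str) -> str:
    # 1) Query blending: original query plus externally and internally
    #    knowledge-augmented variants.
    external_ctx = " ".join(retrieve(question))
    internal_ctx = call_llm(f"Provide background for: {question}")
    queries = [
        question,
        f"{question} {external_ctx}",   # external knowledge augmentation
        f"{question} {internal_ctx}",   # internal knowledge augmentation
    ]
    # 2) Retrieve with every blended query and pool the results.
    pooled = {doc for q in queries for doc in retrieve(q)}
    # 3) Knowledge filtering: ask the LLM to discard irrelevant passages.
    filtered = call_llm(
        f"Keep only passages relevant to '{question}':\n---\n"
        + "\n".join(sorted(pooled))
    )
    # 4) Answer using only the filtered knowledge.
    return call_llm(f"Answer '{question}' using:\n{filtered}")

print(blendfilter_answer("What is the capital of France?"))
```

The key design point the abstract emphasizes is that retrieval runs over all blended queries (broadening coverage), while the filtering step relies on the LLM itself rather than a separately trained filter.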