By leveraging the retrieval of information from external knowledge databases, Large Language Models (LLMs) exhibit enhanced capabilities for accomplishing many knowledge-intensive tasks. However, due to the inherent flaws of current retrieval systems, irrelevant information may appear among the retrieved top-ranked passages. In this work, we present a comprehensive investigation into the robustness of LLMs to different types of irrelevant information under various conditions. We first introduce a framework for constructing high-quality irrelevant information that ranges from semantically unrelated to partially related to fully related to the question. Our analysis further demonstrates that the constructed irrelevant information not only scores highly on similarity metrics, making it likely to be retrieved by existing systems, but also bears semantic connections to the context. Our investigation reveals that current LLMs still struggle to discriminate highly semantically related information and are easily distracted by such irrelevant yet misleading content. Moreover, we find that current solutions for handling irrelevant information offer only limited improvements to the robustness of LLMs against such distractions. All resources are available on GitHub at https://github.com/Di-viner/LLM-Robustness-to-Irrelevant-Information.