By retrieving information from external knowledge databases, Large Language Models (LLMs) exhibit enhanced capabilities for accomplishing many knowledge-intensive tasks. However, due to the inherent flaws of current retrieval systems, irrelevant information may appear among the retrieved top-ranked passages. In this work, we present a comprehensive investigation into the robustness of LLMs to different types of irrelevant information under various conditions. We first introduce a framework for constructing high-quality irrelevant information that ranges from semantically unrelated, to partially related, to related to the question. Our analysis demonstrates that the constructed irrelevant information not only scores highly on similarity metrics, and is therefore ranked highly by existing retrieval systems, but also bears semantic connections to the context. Our investigation reveals that current LLMs still struggle to discriminate highly semantically related information and are easily distracted by this irrelevant yet misleading content. We also find that current solutions for handling irrelevant information have limited effect in improving the robustness of LLMs to such distractions. All resources are available on GitHub at https://github.com/Di-viner/LLM-Robustness-to-Irrelevant-Information.
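To illustrate why semantically related but answer-irrelevant passages can be ranked highly, here is a minimal sketch (not the paper's actual framework) that scores passages against a question with bag-of-words cosine similarity; the example question and passages are invented for illustration. A distractor that shares topic words with the question scores close to the truly relevant passage, while an unrelated passage scores zero:

```python
import math
from collections import Counter

def cosine(a: str, b: str) -> float:
    """Bag-of-words cosine similarity between two texts (a toy retrieval score)."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0

question = "when was the eiffel tower built"
# Answers the question directly.
relevant = "the eiffel tower was built between 1887 and 1889"
# Topically related but does not answer the question -- an irrelevant distractor.
distractor = "the eiffel tower was built by a company and was painted brown"
# Semantically unrelated to the question.
unrelated = "bananas are rich in potassium and grow in tropical climates"

scores = {name: cosine(question, text)
          for name, text in [("relevant", relevant),
                             ("distractor", distractor),
                             ("unrelated", unrelated)]}
for name, s in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {s:.3f}")
```

Under a similarity-based retriever like this, the distractor lands near the top of the ranking despite carrying no answer, which is exactly the failure mode the paper's constructed irrelevant information exploits.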