In multi-hop reasoning, multi-round retrieval-augmented generation (RAG) methods typically rely on content generated by large language models (LLMs) as the retrieval query. However, these approaches are inherently vulnerable to knowledge overshadowing, a phenomenon in which critical information is suppressed by more dominant knowledge during generation. As a result, the LLM-generated content may be incomplete or inaccurate, leading to irrelevant retrieval and accumulating errors across iterations. To address this challenge, we propose ActiShade, which detects and activates overshadowed knowledge to guide LLMs in multi-hop reasoning. Specifically, ActiShade iteratively detects the overshadowed keyphrase in the given query, retrieves documents relevant to both the query and the overshadowed keyphrase, and generates a new query from the retrieved documents to guide the next round. By supplementing the overshadowed knowledge when formulating next-round queries, while minimizing the introduction of irrelevant noise, ActiShade reduces the error accumulation caused by knowledge overshadowing. Extensive experiments show that ActiShade outperforms existing methods across multiple datasets and LLMs.
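The iterative loop described above can be sketched as follows. This is a minimal illustrative sketch only: the helper functions (`detect_overshadowed`, `retrieve`, `generate_query`) are hypothetical stand-ins, since the abstract does not specify how detection, retrieval, or query generation are implemented.

```python
def detect_overshadowed(query, knowledge):
    # Hypothetical stand-in: return the first known keyphrase
    # that is missing from the current query, if any.
    for phrase in knowledge:
        if phrase not in query:
            return phrase
    return None

def retrieve(query, keyphrase, corpus):
    # Hypothetical stand-in retrieval: keep documents that mention
    # the overshadowed keyphrase or any word from the query.
    return [d for d in corpus
            if keyphrase in d or any(w in d for w in query.split())]

def generate_query(query, docs):
    # Hypothetical stand-in generation: fold retrieved evidence
    # into the query for the next round.
    return query + " | " + " ; ".join(docs)

def actishade_loop(query, corpus, knowledge, max_rounds=3):
    # Iterate: detect an overshadowed keyphrase, retrieve documents
    # relevant to both query and keyphrase, and form the next query.
    for _ in range(max_rounds):
        keyphrase = detect_overshadowed(query, knowledge)
        if keyphrase is None:
            break  # nothing overshadowed remains
        docs = retrieve(query, keyphrase, corpus)
        query = generate_query(query, docs)
    return query
```

In this toy setting the loop terminates once every known keyphrase appears in the accumulated query, mirroring how each round supplements previously overshadowed knowledge.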