MIXRAG: Mixture-of-Experts Retrieval-Augmented Generation for Textual Graph Understanding and Question Answering

Large Language Models (LLMs) have achieved impressive performance across a wide range of applications. However, they often hallucinate in knowledge-intensive domains because they rely on static pretraining corpora. Retrieval-Augmented Generation (RAG) addresses this limitation by incorporating external knowledge sources during inference. Among these sources, textual graphs provide structured and semantically rich information that supports more precise and interpretable reasoning, which has led to growing interest in graph-based RAG systems. Despite their potential, most existing approaches rely on a single retriever to identify relevant subgraphs, which limits their ability to capture the diverse aspects of complex queries. Moreover, these systems often struggle to judge the relevance of retrieved content accurately, making them prone to distraction by irrelevant noise. To address these challenges, we propose MIXRAG, a Mixture-of-Experts Graph-RAG framework that introduces multiple specialized graph retrievers and a dynamic routing controller to better handle diverse query intents. Each retriever is trained to focus on a specific aspect of graph semantics, such as entities, relations, or subgraph topology. A Mixture-of-Experts module adaptively selects and fuses the relevant retrievers based on the input query. To reduce noise in the retrieved information, we introduce a query-aware GraphEncoder that analyzes relationships within the retrieved subgraphs, emphasizing the parts most relevant to the query and down-weighting the rest. Empirical results demonstrate that our method achieves state-of-the-art performance and consistently outperforms various baselines. MIXRAG is effective across a wide range of graph-based tasks in different domains. The code will be released upon paper acceptance.
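
The routing idea described above, in which a controller selects a few retrievers for a query and fuses their outputs, can be illustrated with a minimal sketch. The abstract does not specify how MIXRAG's router is implemented, so the code below substitutes a generic top-k softmax gate, a common Mixture-of-Experts pattern. The names `softmax`, `fuse_retrievers`, `gate_logits`, `expert_scores`, and `top_k` are hypothetical and chosen only for illustration.

```python
import math
from typing import Dict, List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_retrievers(
    gate_logits: List[float],
    expert_scores: List[Dict[str, float]],
    top_k: int = 2,
) -> Dict[str, float]:
    """Fuse per-retriever node scores with a top-k softmax gate.

    ASSUMPTION: this is a generic MoE gating sketch, not the paper's router.
    gate_logits[i]   -- hypothetical query-dependent logit for retriever i
    expert_scores[i] -- hypothetical map from node id to the relevance
                        score assigned by retriever i
    """
    # Keep only the top_k retrievers with the highest gate logits.
    ranked = sorted(
        range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True
    )[:top_k]
    # Renormalize the gate over the selected retrievers only.
    weights = softmax([gate_logits[i] for i in ranked])

    # Weighted sum of node scores across the selected retrievers.
    fused: Dict[str, float] = {}
    for w, i in zip(weights, ranked):
        for node, score in expert_scores[i].items():
            fused[node] = fused.get(node, 0.0) + w * score
    return fused


if __name__ == "__main__":
    # Three illustrative retrievers (e.g. entity, relation, topology).
    logits = [0.0, 0.0, -10.0]
    scores = [{"a": 1.0}, {"a": 0.0, "b": 1.0}, {"c": 1.0}]
    print(fuse_retrievers(logits, scores, top_k=2))
```

In this sketch a retriever with a low gate logit is dropped entirely, so nodes that only it returned never reach the fused result. That is one simple way for a router to keep an unsuitable retriever from adding noise for a given query.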
