Retrieval-Augmented Generation (RAG) over Knowledge Graphs (KGs) can lose important contextual nuance at indexing time, when text is reduced to triples. This loss degrades performance in downstream Question-Answering (QA) tasks, particularly multi-hop QA, which requires composing answers from multiple entities, facts, or relations. We propose a domain-agnostic, KG-based QA framework that covers both the indexing and the retrieval/inference phases. A new indexing approach, Map-Disambiguate-Enrich-Reduce (MDER), generates context-derived triple descriptions and then integrates them with entity-level summaries, avoiding the need to explicitly traverse graph edges during the QA retrieval phase. Complementing this, we introduce Decompose-Resolve (DR), a retrieval mechanism that decomposes user queries into resolvable triples and grounds them in the KG via iterative reasoning. Together, MDER and DR form an LLM-driven QA pipeline that is robust to sparse, incomplete, and complex relational data. Experiments on standard and domain-specific benchmarks show that MDER-DR achieves substantial improvements over standard RAG baselines (up to 66%) while maintaining cross-lingual robustness. Our code is available at https://github.com/DataSciencePolimi/MDER-DR_RAG.
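To make the two phases concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes that indexing yields one record per entity that bundles its triple descriptions, and that a query has already been decomposed into ordered sub-query triples with variables such as `?city`. The names `TRIPLES`, `build_entity_index`, and `resolve` are hypothetical, the data is illustrative, and plain dictionary lookups stand in for the LLM-driven steps of the actual pipeline.

```python
from collections import defaultdict

# Hypothetical toy data: (subject, relation, object, context-derived description).
TRIPLES = [
    ("Marie Curie", "born_in", "Warsaw", "Marie Curie was born in Warsaw in 1867."),
    ("Warsaw", "located_in", "Poland", "Warsaw is the capital of Poland."),
]


def build_entity_index(triples):
    """Group facts and descriptions by subject entity.

    This is a stand-in for entity-level summaries: each entity gets a single
    record holding its facts and the concatenated triple descriptions.
    """
    grouped = defaultdict(lambda: {"facts": {}, "descriptions": []})
    for subj, rel, obj, desc in triples:
        grouped[subj]["facts"][rel] = obj
        grouped[subj]["descriptions"].append(desc)
    return {
        entity: {"facts": rec["facts"], "summary": " ".join(rec["descriptions"])}
        for entity, rec in grouped.items()
    }


def resolve(sub_queries, index):
    """Resolve sub-query triples in order, binding variables as they are answered.

    Each sub-query is (subject, relation, variable). A subject may itself be a
    variable bound by an earlier step, which is what chains the hops together.
    Returns the variable bindings, or None if any step cannot be grounded.
    """
    bindings = {}
    for subj, rel, var in sub_queries:
        subj = bindings.get(subj, subj)  # substitute an earlier answer, if any
        record = index.get(subj)
        if record is None or rel not in record["facts"]:
            return None
        bindings[var] = record["facts"][rel]
    return bindings


if __name__ == "__main__":
    index = build_entity_index(TRIPLES)
    # "In which country was Marie Curie born?", decomposed by hand into two hops.
    plan = [("Marie Curie", "born_in", "?city"), ("?city", "located_in", "?country")]
    print(resolve(plan, index))
    print(index["Marie Curie"]["summary"])
```

In this sketch each hop consults only the record of a single entity, which loosely mirrors the idea of answering from entity-level summaries; in an LLM-driven pipeline the model would presumably read the `summary` text rather than a structured `facts` dictionary.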