Large language models (LLMs) have been used to generate query expansions that augment original queries to improve information search. Recent studies also explore providing LLMs with initial retrieval results so that the generated expansions are better grounded in the document corpus. However, these methods mostly focus on enhancing textual similarity between search queries and target documents, overlooking relations between documents. For queries like "Find me a highly rated camera for wildlife photography compatible with my Nikon F-Mount lenses", existing methods may generate expansions that are semantically similar to the query but structurally unrelated to the user's intent. To handle such semi-structured queries with both textual and relational requirements, we propose a knowledge-aware query expansion framework that augments LLMs with structured document relations from a knowledge graph (KG). To further address the limitation of entity-based scoring in existing KG-based methods, we leverage document texts as rich KG node representations and use document-based relation filtering for our Knowledge-Aware Retrieval (KAR). Extensive experiments on three datasets from diverse domains show the advantages of our method over state-of-the-art baselines on textual and relational semi-structured retrieval.