Large language models (LLMs) have been used to generate query expansions that augment original queries and improve information search. Recent studies also explore providing LLMs with initial retrieval results so that the generated expansions are better grounded in the document corpus. However, these methods mostly focus on enhancing textual similarity between search queries and target documents, and overlook relations between documents. For queries like "Find me a highly rated camera for wildlife photography compatible with my Nikon F-Mount lenses", existing methods may generate expansions that are semantically similar to the query but structurally unrelated to the user's intent. To handle such semi-structured queries, which carry both textual and relational requirements, we propose a knowledge-aware query expansion framework that augments LLMs with structured document relations from a knowledge graph (KG). To further address the limitation of entity-based scoring in existing KG-based methods, we use document texts as rich KG node representations and apply document-based relation filtering in our Knowledge-Aware Retrieval (KAR). Extensive experiments on three datasets from diverse domains show the advantages of our method over state-of-the-art baselines on semi-structured retrieval with both textual and relational requirements.