In many real-world settings, machine learning models and interactive systems have access to both structured knowledge, e.g., knowledge graphs or tables, and unstructured content, e.g., natural language documents. However, most systems rely on only one of the two. Semi-Structured Knowledge Bases (SKBs) bridge this gap by linking unstructured content to nodes within structured data, thereby enabling new strategies for knowledge access and use. In this work, we present FocusedRetriever, a modular SKB-based framework for multi-hop question answering. It integrates several components (vector similarity search (VSS)-based entity search, LLM-based generation of Cypher queries, and pairwise re-ranking) in a way that enables it to outperform state-of-the-art methods across all three STaRK benchmark test sets, which cover diverse domains and multiple performance metrics. Its average first-hit rate exceeds that of the second-best method by 25.7%. FocusedRetriever leverages (1) the capacity of Large Language Models (LLMs) to extract relational facts and entity attributes from unstructured text, (2) node set joins to filter answer candidates based on these extracted triplets and constraints, (3) VSS to retrieve and rank relevant unstructured content, and (4) the contextual capabilities of LLMs to rank the top-k answers in a final step. For generality, our evaluation incorporates only base LLMs. Nevertheless, our analysis of intermediate results highlights several opportunities for further improvement, including fine-tuning. The source code is publicly available at https://github.com/kramerlab/FocusedRetriever .
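The interplay of steps (2) and (3) above, filtering a candidate node set by structural constraints and then ranking the survivors by vector similarity to the query, can be illustrated with a minimal sketch. The node names, toy embeddings, and the `allowed` set standing in for the result of a Cypher query are all hypothetical; a real system would use a learned text encoder and an actual graph database.

```python
import math

def cosine(u, v):
    # cosine similarity between two dense vectors
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy embeddings of candidate nodes' unstructured documents (hypothetical).
docs = {
    "node_a": [0.9, 0.1, 0.0],
    "node_b": [0.1, 0.8, 0.1],
    "node_c": [0.7, 0.2, 0.1],
}
query_embedding = [1.0, 0.0, 0.0]

# Node set join: keep only candidates that satisfy the structural
# constraints, e.g., the node set returned by a generated Cypher query.
allowed = {"node_a", "node_c"}
candidates = {n: e for n, e in docs.items() if n in allowed}

# Rank the surviving candidates by vector similarity to the query.
ranked = sorted(candidates, key=lambda n: cosine(candidates[n], query_embedding),
                reverse=True)
print(ranked)  # node_a ranks first: highest cosine similarity to the query
```

In the full pipeline, this ranked list would then be passed to an LLM for a final pairwise re-ranking of the top-k answers.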