In many real-world settings, machine learning models and interactive systems have access to both structured knowledge (e.g., knowledge graphs or tables) and unstructured content (e.g., natural language documents). Yet most rely on only one of the two. Semi-Structured Knowledge Bases (SKBs) bridge this gap by linking unstructured content to nodes within structured data. In this work, we present Autofocus-Retriever (AF-Retriever), a modular framework for SKB-based, multi-hop question answering. It combines structural and textual retrieval through novel integration steps and optimizations, achieving the best zero- and one-shot results across all three STaRK QA benchmarks, which span diverse domains and evaluation metrics. AF-Retriever's average first-hit rate surpasses that of the second-best method by 32.1%. Its performance is driven by (1) exchangeable large language models (LLMs) that are used both to parse questions into entity attributes and relational constraints and to rerank the top-k answers, (2) vector similarity search for ranking both extracted entities and final answers, (3) a novel incremental scope expansion procedure that prepares a configurable number of candidates, namely those that best satisfy the given constraints, for reranking, and (4) a hybrid retrieval strategy that reduces susceptibility to errors. In summary, continuously adjusting its focus like an optical autofocus, AF-Retriever delivers a configurable number of answer candidates in four constraint-driven retrieval steps, which are then supplemented and ranked in four additional processing steps. An ablation study and a detailed error analysis, including a comparison of three different LLM reranking strategies, provide component-level insights. The source code is available at https://github.com/kramerlab/AF-Retriever.
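To make the idea of incremental scope expansion more concrete, the following is a minimal sketch of one possible reading of it, not the AF-Retriever implementation. It assumes that each extracted constraint can be checked per candidate, that a precomputed similarity score is available for every candidate, and that the scope is widened by tolerating one more unmet constraint at a time until `k` candidates are collected. The function name `expand_scope` and all parameter names are hypothetical.

```python
from typing import Callable, Dict, List, Sequence


def expand_scope(
    candidates: Sequence[str],
    constraints: Sequence[Callable[[str], bool]],
    similarity: Dict[str, float],
    k: int,
) -> List[str]:
    """Return up to k candidates, preferring those that satisfy the most constraints.

    Illustrative assumption: the scope starts with candidates meeting all
    constraints and is widened one unmet constraint at a time.
    """
    # Count how many constraints each candidate satisfies.
    satisfied = {c: sum(1 for check in constraints if check(c)) for c in candidates}

    selected: List[str] = []
    # Tightest scope first (all constraints met), then relax step by step.
    for required in range(len(constraints), -1, -1):
        tier = [c for c in candidates if satisfied[c] == required]
        # Within one scope level, order candidates by vector similarity.
        tier.sort(key=lambda c: similarity[c], reverse=True)
        selected.extend(tier)
        if len(selected) >= k:
            break  # enough candidates for the reranking stage
    return selected[:k]


if __name__ == "__main__":
    # Toy data: "a" meets both constraints, "b" and "c" one each, "d" none.
    nodes = ["a", "b", "c", "d"]
    checks = [lambda c: c in {"a", "b"}, lambda c: c in {"a", "c"}]
    scores = {"a": 0.1, "b": 0.9, "c": 0.5, "d": 0.99}
    print(expand_scope(nodes, checks, scores, k=3))
```

In this sketch, a candidate that satisfies more constraints always precedes one that satisfies fewer, regardless of similarity; the similarity score only breaks ties inside a scope level. Whether AF-Retriever weighs the two signals this way is not stated in the abstract, so the repository should be consulted for the actual procedure.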