In many real-world settings, machine learning models and interactive systems have access to both structured knowledge, e.g., knowledge graphs or tables, and unstructured content, e.g., natural language documents. Yet most systems rely on only one of the two. Semi-Structured Knowledge Bases (SKBs) bridge this gap by linking unstructured content to nodes within structured data. In this work, we present Autofocus-Retriever (AF-Retriever), a modular framework for SKB-based, multi-hop question answering. It combines structural and textual retrieval through novel integration steps and optimizations, achieving the best zero- and one-shot results across all three STaRK QA benchmarks, which span diverse domains and evaluation metrics. AF-Retriever's average first-hit rate surpasses the second-best method by 32.1%. Its performance is driven by (1) leveraging exchangeable large language models (LLMs) to extract entity attributes and relational constraints, used both for parsing the query and for reranking the top-k answers, (2) vector similarity search for ranking both extracted entities and final answers, (3) a novel incremental scope expansion procedure that assembles a configurable number of candidates that best fulfill the given constraints in preparation for reranking, and (4) a hybrid retrieval strategy that reduces error susceptibility. In summary, while constantly adjusting the focus like an optical autofocus, AF-Retriever delivers a configurable number of answer candidates in four constraint-driven retrieval steps, which are then supplemented and ranked through four additional processing steps. An ablation study and a detailed error analysis, including a comparison of three different LLM reranking strategies, provide component-level insights. The source code is available at https://github.com/kramerlab/AF-Retriever.
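The abstract itself gives no pseudocode, but the incremental scope expansion in point (3) can be illustrated with a minimal sketch. The sketch assumes a simple breadth-first neighborhood growth over the SKB graph, with hypothetical `neighbors` and `satisfies` callables standing in for graph traversal and constraint checking; the actual AF-Retriever procedure may differ in its expansion order and stopping criteria.

```python
from typing import Callable, Iterable

def incremental_scope_expansion(
    seeds: list[str],
    neighbors: Callable[[str], Iterable[str]],   # hypothetical graph-access callback
    satisfies: Callable[[str], bool],            # hypothetical constraint check
    k: int,                                      # configurable number of candidates
    max_hops: int = 3,
) -> list[str]:
    """Grow the candidate scope hop by hop, starting from seed entities,
    until at least k constraint-satisfying candidates are collected
    (or the hop limit / graph is exhausted)."""
    frontier = list(seeds)
    seen = set(seeds)
    candidates = [n for n in frontier if satisfies(n)]
    hops = 0
    while len(candidates) < k and frontier and hops < max_hops:
        next_frontier = []
        for node in frontier:
            for nb in neighbors(node):
                if nb not in seen:
                    seen.add(nb)
                    next_frontier.append(nb)
                    if satisfies(nb):
                        candidates.append(nb)
        frontier = next_frontier
        hops += 1
    return candidates[:k]
```

On a toy graph, expansion stops as soon as the configurable budget `k` is met, which is what keeps the candidate set for reranking small and focused.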