While question answering over knowledge bases (KBQA) has made progress on factoid questions, KBQA with numerical reasoning remains relatively unexplored. In this paper, we focus on complex numerical reasoning in KBQA and propose a new task, NR-KBQA, which requires both multi-hop reasoning and numerical reasoning. We design a Python-format logical form called PyQL to represent the reasoning process of numerical reasoning questions. To facilitate the development of NR-KBQA, we present a large dataset called MarkQA, which is automatically constructed from a small set of seeds. Each question in MarkQA is paired with its corresponding SPARQL query, along with the step-by-step reasoning process expressed both in the QDMR format and as a PyQL program. Experiments with several state-of-the-art QA methods on MarkQA show that complex numerical reasoning in KBQA remains highly challenging.
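To make the task concrete, the following is a minimal sketch of the kind of question NR-KBQA targets: one that needs a multi-hop lookup followed by arithmetic. The toy knowledge base, the entity and relation names, and the helpers `hop` and `population_ratio` are all hypothetical. This is plain Python for illustration only; it does not use PyQL's actual syntax and is not drawn from MarkQA.

```python
# Toy knowledge base mapping (subject, relation) -> object.
# All entities and values are invented for illustration.
TOY_KB = {
    ("CountryA", "capital"): "CityA",
    ("CountryB", "capital"): "CityB",
    ("CityA", "population"): 2_000_000,
    ("CityB", "population"): 500_000,
}


def hop(entity, *relations):
    """Follow a chain of relations from an entity (the multi-hop part)."""
    current = entity
    for relation in relations:
        current = TOY_KB[(current, relation)]
    return current


def population_ratio(country_x, country_y):
    """Sketch of: 'How many times larger is the population of
    X's capital than that of Y's capital?'"""
    # Two-hop lookups: country -> capital -> population.
    pop_x = hop(country_x, "capital", "population")
    pop_y = hop(country_y, "capital", "population")
    # Numerical reasoning step: combine the retrieved values.
    return pop_x / pop_y


ratio = population_ratio("CountryA", "CountryB")
print(ratio)
```

The point of the sketch is only the shape of the problem: the answer is not stored in the knowledge base and has to be computed from values reached through several relations.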