Question answering over knowledge bases is considered a difficult problem because systems must generalize to a wide variety of natural language questions. In addition, schema items are heterogeneous across knowledge bases, which often necessitates specialized training for each knowledge base question-answering (KBQA) dataset. To handle questions over diverse KBQA datasets with a unified, training-free framework, we propose KB-BINDER, which for the first time enables few-shot in-context learning over KBQA tasks. First, KB-BINDER leverages large language models such as Codex to generate a logical form as a draft for a given question by imitating a few demonstrations. Second, KB-BINDER grounds the draft on the knowledge base, binding it to an executable logical form via BM25 score matching. Experimental results on four public heterogeneous KBQA datasets show that KB-BINDER achieves strong performance with only a few in-context demonstrations. On GraphQA and 3-hop MetaQA in particular, KB-BINDER even outperforms state-of-the-art trained models. On GrailQA and WebQSP, our model is also on par with other fully trained models. We believe KB-BINDER can serve as an important baseline for future research. Our code is available at https://github.com/ltl3A87/KB-BINDER.
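To make the binding step more concrete, the following is a minimal sketch of how schema items in a drafted logical form might be matched to real knowledge base schema items by BM25 score. It is not the released KB-BINDER implementation: the tokenizer, the `BM25` class, the `bind_draft` helper, the parameter values `k1=1.5` and `b=0.75`, and the Freebase-style relation names are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split a schema item into lowercase alphanumeric tokens (an assumed scheme)."""
    return re.findall(r"[a-z0-9]+", text.lower())


class BM25:
    """Small Okapi BM25 index over a list of schema item strings."""

    def __init__(self, corpus, k1=1.5, b=0.75):
        self.k1 = k1
        self.b = b
        self.raw = list(corpus)
        self.docs = [tokenize(d) for d in self.raw]
        self.avgdl = sum(len(d) for d in self.docs) / len(self.docs)
        n = len(self.docs)
        # Document frequency: number of schema items containing each token.
        df = Counter(t for d in self.docs for t in set(d))
        # Smoothed IDF that stays positive for every token.
        self.idf = {t: math.log(1 + (n - f + 0.5) / (f + 0.5)) for t, f in df.items()}

    def score(self, query, index):
        """BM25 score of the schema item at `index` for the query string."""
        doc = self.docs[index]
        tf = Counter(doc)
        total = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            length_norm = 1 - self.b + self.b * len(doc) / self.avgdl
            total += self.idf[term] * tf[term] * (self.k1 + 1) / (tf[term] + self.k1 * length_norm)
        return total

    def top_k(self, query, k=1):
        """Return the k highest-scoring schema items for the query."""
        ranked = sorted(range(len(self.docs)), key=lambda i: self.score(query, i), reverse=True)
        return [self.raw[i] for i in ranked[:k]]


def bind_draft(draft_items, kb_schema, k=1):
    """Map each drafted (possibly non-existent) schema item to its k nearest KB items."""
    index = BM25(kb_schema)
    return {item: index.top_k(item, k) for item in draft_items}


if __name__ == "__main__":
    # Hypothetical schema items; a real knowledge base would supply these.
    kb_schema = [
        "music.artist.album",
        "music.album.release_date",
        "film.film.directed_by",
        "people.person.place_of_birth",
    ]
    # Relation names as a language model might draft them, not guaranteed to exist.
    print(bind_draft(["music.artist.albums", "film.director"], kb_schema))
```

In this sketch the draft only needs to be lexically close to a real schema item. The binder then swaps in the best-scoring candidates, and raising `k` would yield several candidate logical forms to try against the knowledge base.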