Question answering over knowledge bases is considered a difficult problem because systems must generalize to a wide variety of possible natural language questions. Additionally, the heterogeneity of schema items across knowledge bases often necessitates specialized training for each knowledge base question-answering (KBQA) dataset. To handle questions over diverse KBQA datasets with a unified, training-free framework, we propose KB-BINDER, which for the first time enables few-shot in-context learning over KBQA tasks. First, KB-BINDER leverages large language models such as Codex to generate a draft logical form for a given question by imitating a few demonstrations. Second, KB-BINDER grounds the draft in the knowledge base, binding it to an executable logical form with BM25 score matching. Experimental results on four public heterogeneous KBQA datasets show that KB-BINDER achieves strong performance with only a few in-context demonstrations. On GraphQA and 3-hop MetaQA in particular, KB-BINDER even outperforms state-of-the-art trained models. On GrailQA and WebQSP, our model is on par with other fully trained models. We believe KB-BINDER can serve as an important baseline for future research. We plan to release all the code and data.
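To make the binding step concrete, the following is a minimal sketch of BM25 score matching between a schema item in a generated draft and the schema items of a knowledge base. It is not the released KB-BINDER implementation: the tokenizer, the `bind` helper, the example schema items in `HYPOTHETICAL_KB_ITEMS`, and the parameter values `k1=1.5` and `b=0.75` (common Okapi BM25 defaults) are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical schema items, written in a Freebase-like dotted style for illustration only.
HYPOTHETICAL_KB_ITEMS = [
    "music.album.artist",
    "music.artist.album",
    "film.film.directed_by",
    "people.person.place_of_birth",
]


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters (an assumed tokenization)."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


class BM25:
    """Okapi BM25 over a small, non-empty corpus of schema-item strings."""

    def __init__(self, corpus, k1=1.5, b=0.75):
        self.docs = [tokenize(d) for d in corpus]
        self.k1, self.b = k1, b
        self.avgdl = sum(len(d) for d in self.docs) / len(self.docs)
        n = len(self.docs)
        # Document frequency: number of items containing each token.
        df = Counter(t for d in self.docs for t in set(d))
        # Smoothed IDF that stays positive for every token.
        self.idf = {t: math.log(1 + (n - f + 0.5) / (f + 0.5)) for t, f in df.items()}

    def score(self, query, index):
        """BM25 score of the schema item at `index` for the query string."""
        doc = self.docs[index]
        tf = Counter(doc)
        norm = self.k1 * (1 - self.b + self.b * len(doc) / self.avgdl)
        return sum(
            self.idf[t] * tf[t] * (self.k1 + 1) / (tf[t] + norm)
            for t in tokenize(query)
            if t in tf
        )


def bind(draft_item, kb_items, top_k=1):
    """Return the top_k knowledge base schema items ranked by BM25 score against a draft item."""
    bm25 = BM25(kb_items)
    ranked = sorted(
        range(len(kb_items)),
        key=lambda i: bm25.score(draft_item, i),
        reverse=True,
    )
    return [kb_items[i] for i in ranked[:top_k]]
```

In this sketch, a draft item such as `film.director`, which does not exist in the hypothetical schema, is bound to the closest existing item by lexical overlap. A full pipeline would additionally need to check that the resulting logical form executes against the knowledge base, which this example does not attempt.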