Current methods for Knowledge-Based Question Answering (KBQA) usually rely on complex training techniques and model frameworks, which limits their use in practical applications. Recently, the emergence of In-Context Learning (ICL) capabilities in Large Language Models (LLMs) has provided a simple, training-free semantic parsing paradigm for KBQA: given a small number of questions and their labeled logical forms as demonstration examples, an LLM can understand the task intent and generate the logical form for a new question. However, current powerful LLMs have had little exposure to logical forms during pre-training, resulting in a high format error rate. To solve this problem, we propose a code-style in-context learning method for KBQA, which converts the generation of unfamiliar logical forms into code generation, a process more familiar to LLMs. Experimental results on three mainstream datasets show that our method dramatically mitigates the format error problem in generating logical forms while achieving new state-of-the-art (SOTA) results on WebQSP, GrailQA, and GraphQ under the few-shot setting.
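To make the core idea concrete, the following is a minimal sketch of how a logical form might be produced as a short program rather than written directly. It is an illustration only, not the implementation described in this work: the helper names (`join`, `count`, `run_generated_program`), the S-expression-like output format, the relation `film.film.directed_by`, and the entity placeholder `m.example` are all assumptions introduced here. The `generated` string stands in for what an LLM might return for a question such as "How many films did this person direct?".

```python
import ast

# Hypothetical helper functions: each one builds a fragment of an
# S-expression-like logical form. These names are illustrative assumptions.
def join(relation: str, expression: str) -> str:
    """Follow `relation` from the entities denoted by `expression`."""
    return f"(JOIN {relation} {expression})"

def count(expression: str) -> str:
    """Count the entities denoted by `expression`."""
    return f"(COUNT {expression})"

ALLOWED_FUNCTIONS = {"join": join, "count": count}

def run_generated_program(program: str) -> str:
    """Check that the program is valid Python, run it with only the
    allowed helpers in scope, and return the final `expression` value."""
    ast.parse(program)  # raises SyntaxError if the output is malformed
    namespace = {"__builtins__": {}, **ALLOWED_FUNCTIONS}
    exec(program, namespace)
    return namespace["expression"]

# Stand-in for LLM output; the relation and entity ID are made up.
generated = '''
expression = "m.example"
expression = join("film.film.directed_by", expression)
expression = count(expression)
'''

print(run_generated_program(generated))
# prints: (COUNT (JOIN film.film.directed_by m.example))
```

Because the model's output is ordinary code in this sketch, a malformed generation can be detected with a standard parser before any knowledge base query is attempted, which is one plausible way the format error rate becomes easier to control.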