Knowledge base question generation (KBQG) aims to generate natural language questions from a set of triplet facts extracted from a knowledge base (KB). Existing methods have significantly boosted KBQG performance by using pre-trained language models (PLMs), thanks to the rich semantic knowledge they encode. With the advance of pre-training techniques, large language models (LLMs) such as GPT-3.5 undoubtedly possess much more semantic knowledge. How to effectively organize and exploit this abundant knowledge for KBQG is therefore the focus of our study. In this work, we propose SGSH, a simple and effective framework to Stimulate GPT-3.5 with Skeleton Heuristics to enhance KBQG. The framework incorporates "skeleton heuristics", which provide fine-grained guidance associated with each input, covering essential elements such as the question phrase and the auxiliary verb, to stimulate LLMs to generate optimal questions. More specifically, we devise an automatic data construction strategy that leverages ChatGPT to build a skeleton training dataset, on which we train a BART model with a soft prompting approach to generate the skeleton associated with each input. Subsequently, the skeleton heuristics are encoded into the prompt to incentivize GPT-3.5 to generate the desired questions. Extensive experiments demonstrate that SGSH achieves new state-of-the-art performance on KBQG tasks.
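As a rough illustration of the final step, in which skeleton heuristics are encoded into the prompt, the following minimal sketch assembles a text prompt from a set of triples and a skeleton string. It is not the authors' implementation: the function names `linearize_triples` and `build_prompt`, the prompt wording, the triple serialization format, and the example skeleton are all assumptions made for illustration. In the framework described above, the skeleton would come from the trained BART model and the resulting prompt would be sent to GPT-3.5; both steps are omitted here.

```python
from typing import List, Tuple

# A KB fact as (subject, predicate, object). This representation is an
# assumption made for this sketch.
Triple = Tuple[str, str, str]


def linearize_triples(triples: List[Triple]) -> str:
    """Serialize triples into one string (hypothetical format)."""
    return " ; ".join(f"<{s}, {p}, {o}>" for s, p, o in triples)


def build_prompt(triples: List[Triple], skeleton: str) -> str:
    """Combine the facts and a skeleton heuristic into one prompt.

    The instruction wording and field labels are illustrative only.
    """
    return "\n".join(
        [
            "Generate a natural language question from the facts below.",
            "Follow the skeleton, which gives the question phrase and auxiliary verb.",
            f"Facts: {linearize_triples(triples)}",
            f"Skeleton: {skeleton}",
            "Question:",
        ]
    )


if __name__ == "__main__":
    # Hypothetical input; in SGSH the skeleton is predicted per input.
    facts = [("Barack Obama", "place_of_birth", "Honolulu")]
    print(build_prompt(facts, "where was _ ?"))
```

Keeping the skeleton on its own labeled line is one possible design choice: it makes the heuristic easy to swap per input without touching the rest of the prompt.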