Question and answer generation (QAG) consists of generating a set of question-answer pairs given a context (e.g., a paragraph). This task has a variety of applications, such as data augmentation for question answering (QA) models, information retrieval, and education. In this paper, we establish baselines with three different QAG methodologies that leverage sequence-to-sequence language model (LM) fine-tuning. Experiments show that an end-to-end QAG model, which is computationally light at both training and inference time, is generally robust and outperforms other, more convoluted approaches. However, results differ depending on the underlying generative LM. Finally, our analysis shows that QA models fine-tuned solely on generated question-answer pairs can be competitive with supervised QA models trained on human-labeled data.