Question Generation (QG) is a Natural Language Processing (NLP) task that aims to automatically generate questions from text. Many applications can benefit from automatically generated questions, but the questions often need to be curated first, either by selecting or by editing them. This curation is informative in itself, but because it is typically done after generation, the effort is wasted. In addition, most existing systems cannot easily incorporate this feedback. In this work, we present GEN, a system that learns from such (implicit) feedback. Following a pattern-based approach, it takes as input a small set of sentence/question pairs and creates patterns, which are then applied to new, unseen sentences. Each generated question, once corrected by the user, is used as a new seed in the next iteration, so more patterns are created each time. We also use the user's corrections to score the patterns and, in turn, to rank the generated questions. Results show that, compared to a version with no learning, GEN improves by learning from both levels of implicit feedback when the top 5, 10, and 20 questions are considered. Improvements start at 10% and vary with the metric and strategy used.
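The feedback loop described above can be illustrated with a minimal sketch. It is not GEN's implementation: the abstract does not specify how patterns are represented or scored, so patterns are treated here as opaque identifiers, and the names `FeedbackLoop`, `PatternStats`, and `similarity`, the token-overlap scoring, and the neutral prior of 0.5 are all assumptions made for illustration. The sketch shows only the two uses of a user correction: adding the corrected question as a new seed, and scoring the pattern that produced the question so that later candidates can be ranked.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher


def similarity(generated: str, corrected: str) -> float:
    """Token-level similarity in [0, 1]; 1.0 means the user changed nothing.

    Assumption: edit size is approximated with difflib, not GEN's own metric.
    """
    return SequenceMatcher(None, generated.split(), corrected.split()).ratio()


@dataclass
class PatternStats:
    """Running feedback for one pattern (hypothetical bookkeeping)."""
    total: float = 0.0
    count: int = 0

    @property
    def score(self) -> float:
        # Patterns with no feedback yet get a neutral prior of 0.5 (assumption).
        return self.total / self.count if self.count else 0.5


@dataclass
class FeedbackLoop:
    seeds: list = field(default_factory=list)  # (sentence, question) pairs
    stats: dict = field(default_factory=dict)  # pattern id -> PatternStats

    def record(self, pattern_id, sentence, generated, corrected):
        """Use one user correction in both ways the abstract describes."""
        # First use: the corrected question becomes a seed for the next iteration.
        self.seeds.append((sentence, corrected))
        # Second use: the size of the edit scores the pattern that produced it.
        stats = self.stats.setdefault(pattern_id, PatternStats())
        stats.total += similarity(generated, corrected)
        stats.count += 1

    def rank(self, candidates):
        """Sort (pattern_id, question) candidates by pattern score, best first."""
        def pattern_score(candidate):
            return self.stats.get(candidate[0], PatternStats()).score
        return sorted(candidates, key=pattern_score, reverse=True)


loop = FeedbackLoop()
sentence = "Mozart was born in Salzburg."
# Pattern "p1" produced a question the user accepted unchanged.
loop.record("p1", sentence, "Where was Mozart born?", "Where was Mozart born?")
# Pattern "p2" produced a question the user had to rewrite.
loop.record("p2", sentence, "What Mozart was born?", "Where was Mozart born?")
ranked = loop.rank([("p2", "What Bach was born?"), ("p1", "Where was Bach born?")])
```

In this toy run, the question from `p1` is ranked ahead of the one from `p2`, because `p1`'s earlier output needed no correction. A real system would also regenerate patterns from the enlarged seed set at each iteration, a step this sketch leaves out.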