Given a possibly false claim sentence, how can we automatically correct it with minimal editing? Existing methods either require a large number of paired false and corrected claims for supervised training or handle errors that span multiple tokens within an utterance poorly. In this paper, we propose VENCE, a novel method for factual error correction (FEC) with minimal edits. VENCE formulates the FEC problem as iteratively sampling editing actions with respect to a target density function. We carefully design the target function using truthfulness scores predicted by an offline-trained fact verification model. VENCE samples the most probable editing positions based on back-calculated gradients of the truthfulness score with respect to the input tokens, and samples the editing actions using a distantly supervised language model (T5). Experiments on a public dataset show that VENCE improves the widely adopted SARI metric by 5.3 (a relative improvement of 11.8%) over the previous best distantly supervised methods.
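To make the sampling formulation concrete, the following is a minimal toy sketch, not the VENCE implementation. It assumes a Metropolis-Hastings-style accept/reject step (the abstract does not name the sampler), replaces the trained fact verification model with a hypothetical lookup against a known reference claim, replaces gradient-based position scores with a hand-made salience value, and replaces T5 with uniform draws from a tiny vocabulary. Only token replacement is modeled. All names (`REFERENCE`, `VOCAB`, `truth_score`, `target_density`, `position_probs`, `sample_correction`, `edit_penalty`) are illustrative assumptions.

```python
import math
import random

# Hypothetical stand-ins: a real system would use a trained verifier and T5.
REFERENCE = ["paris", "is", "the", "capital", "of", "france"]
VOCAB = ["paris", "london", "france", "germany", "capital", "river"]


def token_support(index, token):
    """Toy per-token support: high if the token agrees with the reference."""
    return 1.0 if token == REFERENCE[index] else 0.05


def truth_score(tokens):
    """Toy truthfulness score in (0, 1]: product of per-token supports."""
    return math.prod(token_support(i, t) for i, t in enumerate(tokens))


def target_density(tokens, original, edit_penalty=0.1):
    """Unnormalized target: truthfulness times a penalty per changed token."""
    n_edits = sum(a != b for a, b in zip(tokens, original))
    return truth_score(tokens) * math.exp(-edit_penalty * n_edits)


def position_probs(tokens):
    """Stand-in for gradient-based salience: weakly supported tokens get more mass."""
    salience = [1.0 - token_support(i, t) + 0.05 for i, t in enumerate(tokens)]
    total = sum(salience)
    return [s / total for s in salience]


def sample_correction(claim, steps=200, seed=0):
    """Iteratively sample single-token replacements; return the best state seen."""
    rng = random.Random(seed)
    original = list(claim)
    current = list(claim)
    best = list(claim)
    for _ in range(steps):
        probs = position_probs(current)
        i = rng.choices(range(len(current)), weights=probs)[0]
        proposal = list(current)
        proposal[i] = rng.choice(VOCAB)  # uniform proposal; T5 would go here
        # Acceptance ratio, corrected for the state-dependent position proposal.
        forward = probs[i]
        backward = position_probs(proposal)[i]
        ratio = (target_density(proposal, original) * backward) / (
            target_density(current, original) * forward
        )
        if rng.random() < min(1.0, ratio):
            current = proposal
        if target_density(current, original) > target_density(best, original):
            best = list(current)
    return best


claim = ["london", "is", "the", "capital", "of", "france"]
corrected = sample_correction(claim)
print(" ".join(corrected))
```

The edit penalty in `target_density` is what encodes the preference for minimal edits in this sketch: a state that changes more tokens than necessary is down-weighted even if the toy verifier scores it as true. The penalty value 0.1 is arbitrary and chosen only so that a one-token fix outscores the unedited false claim.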