Abstractive summarization has enjoyed renewed interest in recent years, thanks to pre-trained language models and the availability of large-scale datasets. Despite promising results, current models still generate factually inconsistent summaries, which reduces their utility in real-world applications. Several recent efforts attempt to address this by devising models that automatically detect factual inconsistencies in machine-generated summaries. However, these efforts focus exclusively on English, a language with abundant resources. In this work, we leverage factual consistency evaluation models to improve multilingual summarization. We explore two intuitive approaches to mitigate hallucinations based on the signal provided by a multilingual NLI model, namely data filtering and controlled generation. Experimental results on the 45 languages of the XLSum dataset show gains over strong baselines in both automatic and human evaluation.
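To make the two approaches concrete, the following is a minimal sketch, not the implementation used in this work. It assumes an entailment scorer that returns the probability that a document entails its summary. The names `Example`, `filter_examples`, `add_control_tokens`, the control tokens `<faithful>` and `<unfaithful>`, the threshold of 0.5, and the word-overlap `toy_scorer` (a stand-in for a real multilingual NLI model) are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical scorer type: maps (document, summary) to an
# entailment probability in [0, 1]. In practice this would wrap
# a multilingual NLI model; here it is left abstract.
EntailmentScorer = Callable[[str, str], float]


@dataclass
class Example:
    document: str
    summary: str


def filter_examples(
    examples: List[Example], scorer: EntailmentScorer, threshold: float = 0.5
) -> List[Example]:
    """Data filtering: keep only pairs whose summary scores as entailed."""
    return [ex for ex in examples if scorer(ex.document, ex.summary) >= threshold]


def add_control_tokens(
    examples: List[Example], scorer: EntailmentScorer, threshold: float = 0.5
) -> List[Example]:
    """Controlled generation: keep all pairs, but prefix each input with a
    token that marks whether its reference summary scores as entailed."""
    tagged = []
    for ex in examples:
        entailed = scorer(ex.document, ex.summary) >= threshold
        token = "<faithful>" if entailed else "<unfaithful>"
        tagged.append(Example(f"{token} {ex.document}", ex.summary))
    return tagged


def toy_scorer(document: str, summary: str) -> float:
    """Stand-in scorer (assumption): fraction of summary words that also
    occur in the document. A real NLI model would replace this."""
    doc_words = set(document.lower().split())
    words = summary.lower().split()
    if not words:
        return 0.0
    return sum(w in doc_words for w in words) / len(words)


examples = [
    Example("the cat sat on the mat", "the cat sat"),
    Example("the cat sat on the mat", "dogs bark loudly"),
]
kept = filter_examples(examples, toy_scorer)
tagged = add_control_tokens(examples, toy_scorer)
```

Under this sketch, filtering discards training pairs judged inconsistent, whereas control tokens retain every pair and leave the choice to inference time, where the input would be prefixed with the token for the desired behavior.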