Automatic evaluation for sentence simplification remains a challenging problem. Most popular evaluation metrics require multiple high-quality references, which are not readily available for simplification, and this makes it difficult to test performance on unseen domains. Furthermore, most existing metrics conflate simplicity with correlated attributes such as fluency or meaning preservation. We propose a new learned evaluation metric (SLE) that focuses on simplicity and outperforms almost all existing metrics in terms of correlation with human judgements.