LENS: A Learnable Evaluation Metric for Text Simplification

Training learnable metrics using modern language models has recently emerged as a promising method for the automatic evaluation of machine translation. However, existing human evaluation datasets for text simplification have limited annotations that are based on unitary or outdated models, making them unsuitable for this approach. To address these issues, we introduce the SimpEval corpus that contains: SimpEval_past, comprising 12K human ratings on 2.4K simplifications of 24 past systems, and SimpEval_2022, a challenging simplification benchmark consisting of over 1K human ratings of 360 simplifications including GPT-3.5 generated text. Training on SimpEval, we present LENS, a Learnable Evaluation Metric for Text Simplification. Extensive empirical results show that LENS correlates much better with human judgment than existing metrics, paving the way for future progress in the evaluation of text simplification. We also introduce Rank and Rate, a human evaluation framework that rates simplifications from several models in a list-wise manner using an interactive interface, which ensures both consistency and accuracy in the evaluation process and is used to create the SimpEval datasets.
