Sentence simplification research tends to focus on generic simplification: rewriting a sentence so that it is more readable and easier to understand. This paper provides a dataset for training models that perform subject-aware sentence simplification rather than simplifying sentences as a whole. We also evaluate models on this dataset whose architectures are inspired by those used in abstractive summarization. We wrote a portion of the data by hand and augmented the dataset by further manipulating those handwritten simplifications. Our results show that the data augmentation, data masking, and model architecture choices used in summarization provide a solid baseline for comparison on subject-aware simplification.