In this paper, we apply transformer-based Natural Language Generation (NLG) techniques to the problem of text simplification. Currently, only a few German datasets are available for text simplification, even fewer contain larger, aligned documents, and none contains narrative texts. We explore the degree to which modern NLG techniques can be applied to German narrative text simplification, using Longformer attention and a pre-trained mBART model. Our findings indicate that existing approaches for German cannot solve the task properly. We conclude with a few directions for future research to address this problem.