Uniform Meaning Representation (UMR) is a recently developed graph-based semantic representation that expands on Abstract Meaning Representation (AMR) in a number of ways, in particular through the inclusion of document-level information and multilingual flexibility. To effectively adopt and leverage UMR for downstream tasks, effort must be devoted to developing a UMR technological ecosystem. Although only a small number of UMR annotations have been produced to date, in this work we investigate the first approaches to producing text from multilingual UMR graphs. Exploiting the structural similarity between UMR and AMR graphs and the wide availability of AMR technologies, we introduce (1) a baseline approach that passes UMR graphs directly to AMR-to-text generation models, (2) a pipeline approach that first converts UMR to AMR and then applies AMR-to-text generation models, and (3) a fine-tuning approach that adapts both foundation models and AMR-to-text generation models with UMR data. Our best-performing models achieve multilingual BERTScores of 0.825 for English and 0.882 for Chinese, a promising indication of the effectiveness of fine-tuning for UMR-to-text generation even with limited UMR data.