Contextual Refinement of Translations: Large Language Models for Sentence and Document-Level Post-Editing

Large Language Models (LLMs) have demonstrated considerable success in various Natural Language Processing tasks, but they have yet to attain state-of-the-art performance in Neural Machine Translation (NMT). Nevertheless, their strong performance on tasks that demand broad understanding and contextual processing shows their potential for translation. To exploit these abilities, we investigate using LLMs for MT and explore recent parameter-efficient fine-tuning techniques. Surprisingly, our initial experiments show that fine-tuning for translation can even degrade performance. To overcome this, we propose an alternative approach: adapting LLMs as Automatic Post-Editors (APE) rather than as direct translators. Building on the ability of LLMs to process and generate lengthy sequences, we also propose extending our approach to document-level translation. We show that Low-Rank Adapter (LoRA) fine-tuning for APE can yield significant improvements on both sentence-level and document-level metrics while generalizing to out-of-domain data. Most notably, we achieve a state-of-the-art accuracy of 89% on the ContraPro test set, which specifically assesses a model's ability to resolve pronoun ambiguities when translating from English to German. Lastly, we investigate a practical scenario involving manual post-editing for document-level translation, in which reference context is available. Here, we demonstrate that leveraging human corrections can significantly reduce the number of edits required for subsequent translations. An interactive demo for integrating manual feedback is available at https://huggingface.co/spaces/skoneru/contextual_refinement_ende.
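To make the post-editing setup more concrete, the following is a minimal sketch of how a document-level APE input might be assembled from a source sentence, a draft translation, and preceding context. The prompt wording, the `Segment` class, the `build_ape_prompt` function, and the `max_context` parameter are all illustrative assumptions and are not taken from the paper. The sketch only mirrors the idea stated in the abstract: earlier sentences serve as context, and confirmed human post-edits replace raw drafts in that context when they exist.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """One sentence of a document (hypothetical container, not from the paper)."""
    source: str          # English source sentence
    draft: str           # draft German translation from an MT system
    post_edit: str = ""  # confirmed human post-edit, empty if none exists yet


def build_ape_prompt(history: List[Segment], current: Segment,
                     max_context: int = 3) -> str:
    """Assemble an assumed post-editing prompt with preceding context."""
    lines = ["Improve the German draft translation using the context."]
    # Keep only the most recent sentences; guard against max_context == 0,
    # because history[-0:] would return the whole list.
    context = history[-max_context:] if max_context > 0 else []
    for seg in context:
        # Prefer a confirmed post-edit over the raw draft as target context.
        target = seg.post_edit or seg.draft
        lines.append(f"English: {seg.source}")
        lines.append(f"German: {target}")
    lines.append(f"English: {current.source}")
    lines.append(f"German draft: {current.draft}")
    lines.append("German post-edit:")
    return "\n".join(lines)


if __name__ == "__main__":
    history = [Segment("The cat sat on the mat.", "Die Katze saß auf der Matte.")]
    # "It" refers to "die Katze", so the draft pronoun "Es" is a candidate
    # for correction once the context is visible to the post-editor.
    current = Segment("It was asleep.", "Es schlief.")
    print(build_ape_prompt(history, current))
```

The resulting string would be passed to a LoRA-adapted LLM, which is expected to complete the final line with a refined translation. Feeding confirmed post-edits back into `history` is one plausible way to let human corrections influence later sentences.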


