(Perhaps) Beyond Human Translation: Harnessing Multi-Agent Collaboration for Translating Ultra-Long Literary Texts

Literary translation remains one of the most challenging frontiers in machine translation due to the complexity of capturing figurative language, cultural nuances, and unique stylistic elements. In this work, we introduce TransAgents, a novel multi-agent framework that simulates the roles and collaborative practices of a human translation company, including a CEO, Senior Editor, Junior Editor, Translator, Localization Specialist, and Proofreader. The translation process is divided into two stages: a preparation stage where the team is assembled and comprehensive translation guidelines are drafted, and an execution stage that involves sequential translation, localization, proofreading, and a final quality check. Furthermore, we propose two innovative evaluation strategies: Monolingual Human Preference (MHP), which evaluates translations based solely on target language quality and cultural appropriateness, and Bilingual LLM Preference (BLP), which leverages large language models like GPT-4 for direct text comparison. Although TransAgents achieves lower d-BLEU scores, due to the limited diversity of references, its translations are significantly better than those of other baselines and are preferred by both human evaluators and LLMs over traditional human references and GPT-4 translations. Our findings highlight the potential of multi-agent collaboration in enhancing translation quality, particularly for longer texts.
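The two-stage process described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the TransAgents implementation: the `Agent` callable, the `Project` class, and the functions `prepare`, `execute`, and `echo_agent` are all hypothetical names. Which role drafts the guidelines and which performs the final quality check are assumptions made here, since the abstract does not specify them.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical stand-in for an LLM call: (role, instruction, text) -> text.
Agent = Callable[[str, str, str], str]

# Sequential execution steps named in the abstract.
EXECUTION_ROLES = ["Translator", "Localization Specialist", "Proofreader"]


@dataclass
class Project:
    guidelines: str = ""
    log: List[str] = field(default_factory=list)


def prepare(agent: Agent, source_sample: str) -> Project:
    """Preparation stage: draft translation guidelines before translating.

    Assumption: the Senior Editor drafts the guidelines.
    """
    guidelines = agent("Senior Editor", "Draft translation guidelines", source_sample)
    return Project(guidelines=guidelines)


def execute(agent: Agent, project: Project, chapter: str) -> str:
    """Execution stage: translate, localize, proofread, then a final check."""
    text = chapter
    for role in EXECUTION_ROLES:
        # Each role receives the shared guidelines and the previous role's output.
        text = agent(role, project.guidelines, text)
        project.log.append(role)
    # Assumption: the Senior Editor performs the final quality check.
    text = agent("Senior Editor", "Final quality check", text)
    project.log.append("Senior Editor (final check)")
    return text


def echo_agent(role: str, instruction: str, text: str) -> str:
    """Toy agent that only tags the text, so the sketch runs without a model."""
    return f"{text} |{role}"


if __name__ == "__main__":
    project = prepare(echo_agent, "sample passage")
    print(execute(echo_agent, project, "chapter 1"))
    print(project.log)
```

In practice `echo_agent` would be replaced by a function that prompts a language model with the role description; the sequential hand-off between roles is the only part taken from the abstract.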


