Beyond Accuracy: Community Perspectives on Machine Translation

Despite remarkable progress in machine translation (MT), non-AI communities have raised growing concerns about MT systems, suggesting a noticeable gap between technical advancement and the needs of real-world users. For instance, while NLP researchers focus on benchmark performance, end users care about ethical concerns, trust, reliability, costs, and more. We argue that listening to diverse user communities is essential for directing research efforts towards the problems these communities care about. To this end, we present the first large-scale analysis of what four stakeholder communities (AI developers, professional translators, language learners, and language service providers) post about MT technology on social media. We construct a dataset of 79,286 posts and comments from Reddit, Facebook, Bluesky, and Mastodon, spanning 2019 to 2025, and analyse where, how, and why these communities disagree. Overall, we find that the communities often disagree and even come into strong conflict, driven by polarised sentiments on topics such as translation quality, efficiency, and reliability. These disagreements arise because the communities approach the topics differently: the AI community frames them as technical and computational problems, while non-AI (user) communities care more about quality nuances, time savings, user trust, and broader social issues.

