We present FRMT, a new dataset and evaluation benchmark for Few-shot Region-aware Machine Translation, a type of style-targeted translation. The dataset consists of professional translations from English into two regional variants each of Portuguese and Mandarin Chinese. Source documents are selected to enable detailed analysis of phenomena of interest, including lexically distinct terms and distractor terms. We explore automatic evaluation metrics for FRMT and validate their correlation with expert human evaluation across both region-matched and mismatched rating scenarios. Finally, we present a number of baseline models for this task, and offer guidelines for how researchers can train, evaluate, and compare their own models. Our dataset and evaluation code are publicly available: https://bit.ly/frmt-task