Human translators linger on some words and phrases more than others, and predicting this variation is a step towards explaining the underlying cognitive processes. Using data from the CRITT Translation Process Research Database, we evaluate the extent to which surprisal and attentional features derived from a Neural Machine Translation (NMT) model account for reading and production times of human translators. We find that surprisal and attention are complementary predictors of translation difficulty, and that surprisal derived from an NMT model is the single most successful predictor of production duration. Our analyses draw on data from hundreds of translators operating across 13 language pairs, and represent the most comprehensive investigation of human translation difficulty to date.