Translations systematically diverge from texts originally produced in the target language, a phenomenon widely referred to as translationese. Translationese has been attributed to production tendencies (e.g., interference, simplification), socio-cultural variables, and language-pair effects, yet a unified explanatory account is still lacking. We propose that translationese reflects the cognitive load inherent in the translation task itself. We test whether observable translationese can be predicted from quantifiable measures of translation task difficulty. Translationese is operationalised as a segment-level translatedness score produced by an automatic classifier. Translation task difficulty is conceptualised as comprising source-text and cross-lingual transfer components, operationalised mainly through information-theoretic metrics based on LLM surprisal, complemented by established syntactic and semantic alternatives. We use a bidirectional English-German corpus comprising written and spoken subcorpora. Results indicate that translationese can be partly explained by translation task difficulty, especially in English-to-German. In most experiments, cross-lingual transfer difficulty contributes more than source-text complexity. Information-theoretic indicators match or outperform traditional features in written mode, but offer no advantage in spoken mode. Source-text syntactic complexity and translation-solution entropy emerge as the strongest predictors of translationese across language pairs and modes.
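The abstract does not spell out how the information-theoretic metrics are computed. As a rough illustration only, the sketch below assumes the textbook definitions: surprisal as the negative log probability of a token, and translation-solution entropy as the Shannon entropy of the distribution over alternative target renderings of one source item. The function names and the toy inputs are hypothetical; in practice the probabilities would come from an LLM and the alternatives from a corpus or alignment step, neither of which is modelled here.

```python
import math
from collections import Counter


def surprisal(prob: float) -> float:
    """Surprisal in bits of an event with probability `prob` (0 < prob <= 1)."""
    return -math.log2(prob)


def mean_surprisal(token_probs: list[float]) -> float:
    """Average per-token surprisal of a segment.

    `token_probs` stands in for probabilities an LLM would assign to each
    token given its context (assumption: supplied by the caller).
    """
    return sum(surprisal(p) for p in token_probs) / len(token_probs)


def solution_entropy(translations: list[str]) -> float:
    """Shannon entropy in bits over the observed translation solutions
    for a single source item (assumption: one string per observed rendering)."""
    counts = Counter(translations)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    # Toy values, not taken from the paper.
    print(mean_surprisal([0.5, 0.25]))
    print(solution_entropy(["Haus", "Haus", "Gebäude", "Heim"]))
```

Under these assumptions, a source item that is always rendered the same way has zero entropy, while one with many competing renderings has high entropy, which is the intuition behind treating it as a measure of cross-lingual transfer difficulty.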