We explore the impact of multi-source input strategies on machine translation (MT) quality, comparing GPT-4o, a large language model (LLM), with a traditional multilingual neural machine translation (NMT) system. Using intermediate-language translations as contextual cues, we evaluate their effectiveness in enhancing translation from English and Chinese into Portuguese. Results suggest that contextual information significantly improves translation quality for domain-specific datasets, and potentially for linguistically distant language pairs, with diminishing returns observed on benchmarks with high linguistic variability. Additionally, we demonstrate that shallow fusion, a multi-source approach we apply within the NMT system, yields improved results when high-resource languages are used as context for other translation pairs, highlighting the importance of strategic context-language selection.