Hallucination and omission, long-standing problems in machine translation (MT), become more pronounced when a large language model (LLM) is used for MT, because LLMs are themselves susceptible to both phenomena. In this work, we mitigate these problems in an LLM-based MT model by guiding it toward better word alignment. We first study the correlation between word alignment and hallucination and omission in MT. We then propose using word alignment as a preference signal to optimize the LLM-based MT model. The preference data are constructed by selecting chosen and rejected translations from the outputs of multiple MT tools, and direct preference optimization (DPO) is then used to optimize the LLM-based model toward this preference signal. Because no evaluators are designed specifically for hallucination and omission in MT, we further propose selecting hard instances and using GPT-4 to directly evaluate how well the models mitigate these issues. We verify the soundness of these evaluation methods experimentally and then present extensive results demonstrating that word alignment-based preference optimization is effective at mitigating hallucination and omission.
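The construction of preference data from word alignment can be illustrated with a minimal sketch. The abstract does not specify how alignment is scored, so everything below is an assumption made for illustration: the `Candidate` type, the `coverage_score` function (the smaller of the aligned fraction of source tokens and the aligned fraction of target tokens), and the rule of taking the best-scoring candidate as chosen and the worst as rejected are all hypothetical. Low source coverage is treated as a rough proxy for omission, and low target coverage as a rough proxy for hallucination. The alignments themselves are assumed to come from an external word aligner.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class Candidate:
    """One translation from one MT tool (hypothetical container)."""
    text: str
    n_target_tokens: int
    # (source_index, target_index) pairs from an external word aligner.
    alignment: Set[Tuple[int, int]]


def coverage_score(n_source_tokens: int, cand: Candidate) -> float:
    """Assumed scoring rule: min of source-side and target-side coverage.

    Unaligned source tokens suggest omission; unaligned target tokens
    suggest hallucination. The minimum penalizes either failure.
    """
    if n_source_tokens == 0 or cand.n_target_tokens == 0:
        return 0.0
    aligned_src = {s for s, _ in cand.alignment}
    aligned_tgt = {t for _, t in cand.alignment}
    src_cov = len(aligned_src) / n_source_tokens
    tgt_cov = len(aligned_tgt) / cand.n_target_tokens
    return min(src_cov, tgt_cov)


def build_preference_pair(
    source: str, n_source_tokens: int, candidates: List[Candidate]
) -> Optional[Dict[str, str]]:
    """Pick the best-aligned candidate as chosen and the worst as rejected."""
    if len(candidates) < 2:
        return None
    ranked = sorted(candidates, key=lambda c: coverage_score(n_source_tokens, c))
    rejected, chosen = ranked[0], ranked[-1]
    # Skip sources where alignment gives no preference signal.
    if coverage_score(n_source_tokens, chosen) == coverage_score(n_source_tokens, rejected):
        return None
    return {"prompt": source, "chosen": chosen.text, "rejected": rejected.text}


# Toy example: a 4-token source with three candidate translations.
full = Candidate("full translation", 4, {(0, 0), (1, 1), (2, 2), (3, 3)})
omits = Candidate("drops half the source", 2, {(0, 0), (1, 1)})
adds = Candidate("adds unsupported words", 6, {(0, 0), (1, 1), (2, 2), (3, 3)})
pair = build_preference_pair("source sentence", 4, [full, omits, adds])
```

The resulting `prompt`/`chosen`/`rejected` records follow the layout that preference-optimization trainers commonly expect; the exact format required depends on the training library used.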