To translate well, machine translation (MT) systems and general-purpose language models (LMs) need a deep understanding of both the source and target languages and their cultures. Idioms, being non-compositional, pose a particular challenge for Transformer-based systems, because literal translations often miss the intended meaning. Traditional methods, which replace idioms using existing knowledge bases (KBs), often lack scale and context awareness. To address these challenges, we introduce IdiomKB, a multilingual idiom KB developed using large LMs. Our approach prioritizes context awareness and scalability: idioms are stored offline in a KB of manageable size, which allows efficient serving with smaller models and provides a more comprehensive understanding of idiomatic expressions. By retrieving idioms' figurative meanings, the KB helps smaller models such as BLOOMZ (7.1B), Alpaca (7B), and InstructGPT (6.7B) translate better. We also present a novel GPT-4-powered metric for human-aligned evaluation and use it to demonstrate that IdiomKB considerably boosts model performance. Human evaluations further validate the KB's quality.
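The retrieval step described above, looking up an idiom's figurative meaning and handing it to a smaller model, can be illustrated with a minimal sketch. The abstract does not specify the KB schema, the matching strategy, or the prompt wording, so everything below is an assumption made for illustration: the `IDIOM_KB` dictionary, its two entries, the substring matching in `find_idiom`, and the prompt template in `build_prompt` are all hypothetical.

```python
from typing import Dict, Optional

# Hypothetical stand-in for an idiom KB: idiom -> figurative meaning.
# The entries and the flat dict layout are illustrative assumptions,
# not the actual IdiomKB schema.
IDIOM_KB: Dict[str, str] = {
    "break the ice": "to ease tension and get a conversation started",
    "画蛇添足": "to ruin something by adding what is unnecessary",
}


def find_idiom(sentence: str, kb: Dict[str, str]) -> Optional[str]:
    """Return the first KB idiom found in the sentence (naive substring match)."""
    lowered = sentence.lower()
    for idiom in kb:
        if idiom.lower() in lowered:
            return idiom
    return None


def build_prompt(sentence: str, target_lang: str,
                 kb: Dict[str, str] = IDIOM_KB) -> str:
    """Build a translation prompt, adding the figurative meaning when an idiom is found."""
    lines = [f"Translate the following sentence into {target_lang}."]
    idiom = find_idiom(sentence, kb)
    if idiom is not None:
        # Assumed prompt wording; the source does not give a template.
        lines.append(f'The idiom "{idiom}" means: {kb[idiom]}.')
    lines.append(f"Sentence: {sentence}")
    lines.append("Translation:")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt("He told a joke to break the ice.", "Chinese"))
```

The resulting prompt string would then be passed to whichever smaller model is being served. Keeping the lookup as a plain offline dictionary reflects the abstract's emphasis on a KB of manageable size, though a real system would likely need more robust idiom matching than a substring check.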