Zero-shot in-context learning is the phenomenon in which a model performs a task given only the instructions. However, pre-trained large language models are known to be poorly calibrated for this setting. One of the most effective ways to handle this miscalibration is to adopt a contrastive decoding objective, which accounts for the prior probability of generating the next token by conditioning on some context. This work introduces an Anti-Language Model objective with a decay factor, designed to address the weaknesses of in-context machine translation. We conduct experiments across three model types and sizes, three language directions, and both greedy decoding and beam search ($B=5$). The proposed method outperforms other state-of-the-art decoding objectives, with improvements of up to $20$ BLEU points over the default objective in some settings.
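To make the idea of a contrastive objective with a decaying penalty concrete, here is a minimal sketch. The abstract does not state the exact formula, so the form used here is an assumption: each candidate token is scored as its conditional log-probability minus a penalty, `gamma ** step` times its log-probability under an "anti" context. The names `anti_lm_scores`, `greedy_pick`, the value `gamma=0.3`, the 0-based `step`, and the toy logits are all hypothetical and are not taken from the paper.

```python
import math


def log_softmax(logits):
    """Convert raw logits to log-probabilities in a numerically stable way."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_z for v in logits]


def anti_lm_scores(cond_logits, anti_logits, step, gamma=0.3):
    """Assumed contrastive score: log p_cond - gamma**step * log p_anti.

    cond_logits: next-token logits given the full prompt (instructions + source).
    anti_logits: next-token logits under the context to be penalised.
    step:        0-based decoding step; the penalty decays as step grows.
    gamma:       hypothetical decay factor in (0, 1).
    """
    cond = log_softmax(cond_logits)
    anti = log_softmax(anti_logits)
    weight = gamma ** step
    return [c - weight * a for c, a in zip(cond, anti)]


def greedy_pick(scores):
    """Return the index of the highest-scoring token."""
    return max(range(len(scores)), key=scores.__getitem__)


# Toy three-token vocabulary: token 0 is slightly preferred by the model,
# but it is also very likely under the penalised context.
cond_logits = [2.0, 1.9, 0.0]
anti_logits = [3.0, 0.0, 0.0]

default_choice = greedy_pick(log_softmax(cond_logits))
early_choice = greedy_pick(anti_lm_scores(cond_logits, anti_logits, step=0))
late_choice = greedy_pick(anti_lm_scores(cond_logits, anti_logits, step=10))
print(default_choice, early_choice, late_choice)
```

In this toy setting the penalty changes the choice at the first step, where its weight is 1, and has almost no effect by step 10, where the weight has decayed to roughly 6e-6. With beam search, the same per-token score would be accumulated over each hypothesis instead of being maximised greedily.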