Function vectors (FVs) are vector representations of tasks extracted from model activations during in-context learning. While prior work has shown that multilingual model representations can be language-agnostic, it remains unclear whether the same holds for function vectors. We study whether FVs exhibit language-agnosticity, using machine translation as a case study. Across three decoder-only multilingual LLMs, we find that translation FVs extracted from a single English$\rightarrow$Target direction transfer to other target languages, consistently improving the rank of correct translation tokens across multiple unseen languages. Ablation results show that removing the FV degrades translation across languages with limited impact on unrelated tasks. We further show that base-model FVs transfer to instruction-tuned variants and partially generalize from word-level to sentence-level translation.
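To make the evaluation quantities above concrete, the following is a minimal sketch of the three operations the abstract refers to: measuring the rank of the correct translation token, adding an FV to a hidden state, and ablating the FV direction. It does not reproduce the extraction procedure or the models used in this work. The function names (`token_rank`, `add_vector`, `ablate_direction`), the random unit-norm unembedding matrix, the choice of the target token's unembedding row as a stand-in FV, and the scale of 4.0 are all illustrative assumptions.

```python
import numpy as np


def token_rank(logits: np.ndarray, target_id: int) -> int:
    """1-based rank of target_id when logits are sorted in descending order."""
    return int(np.sum(logits > logits[target_id])) + 1


def add_vector(hidden: np.ndarray, fv: np.ndarray, scale: float = 1.0) -> np.ndarray:
    """Steer a hidden state by adding a scaled vector to it."""
    return hidden + scale * fv


def ablate_direction(hidden: np.ndarray, fv: np.ndarray) -> np.ndarray:
    """Remove the component of the hidden state that lies along fv."""
    unit = fv / np.linalg.norm(fv)
    return hidden - np.dot(hidden, unit) * unit


# Toy setup (assumption): a random unembedding matrix with unit-norm rows.
rng = np.random.default_rng(0)
d_model, vocab_size = 16, 50
unembed = rng.normal(size=(vocab_size, d_model))
unembed /= np.linalg.norm(unembed, axis=1, keepdims=True)

target_id = 7                      # hypothetical id of the correct translation token
fv = unembed[target_id]            # stand-in FV; a real FV comes from model activations
hidden = rng.normal(size=d_model)  # stand-in for a final-position hidden state

rank_before = token_rank(unembed @ hidden, target_id)
rank_after = token_rank(unembed @ add_vector(hidden, fv, scale=4.0), target_id)
ablated = ablate_direction(hidden, fv)

print("rank before:", rank_before, "rank after:", rank_after)
```

Because the rows of the toy unembedding matrix have unit norm, adding the stand-in FV raises the target logit by at least as much as any other logit, so the target's rank cannot get worse. With a real model and an extracted FV no such guarantee holds, which is why the rank change is measured empirically.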