Current large language models (LLMs) often exhibit imbalances in multilingual capabilities and cultural adaptability, largely due to their English-centric pretraining data. To address this imbalance, we propose a probing method named XTransplant that explores cross-lingual latent interactions via cross-lingual feed-forward transplantation during the inference stage, aiming to enable the model to leverage the strengths of both English and non-English languages. Through extensive pilot experiments, we empirically show that both the multilingual capabilities and the cultural adaptability of LLMs can be significantly improved by XTransplant, from En -> non-En and from non-En -> En respectively, highlighting the underutilization of current LLMs' multilingual potential. The patterns observed in these pilot experiments further motivate an offline scaling inference strategy, which achieves consistent performance improvements on multilingual and culture-aware tasks, sometimes even surpassing multilingual supervised fine-tuning. We hope our further analysis and discussion provide deeper insights into the mechanism of XTransplant.
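To make the idea of feed-forward transplantation at inference time concrete, the following is a minimal sketch, not the paper's actual implementation. It assumes a Llama-style decoder whose per-layer MLP is reachable as model.model.layers[i].mlp, a single hypothetical source/target layer index, and transplantation of only the last prompt token's activation to sidestep sequence-length mismatches; the model name and prompts are likewise illustrative.

```python
# Minimal sketch of cross-lingual feed-forward transplantation at inference time.
# Assumptions (not from the paper): Llama-style decoder, one transplanted layer,
# last-token-only transplantation, hook-based capture and injection.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"   # hypothetical base model
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)
model.eval()

SRC_LAYER = TGT_LAYER = 16                # hypothetical layer index to transplant

cached = {}

def capture_hook(module, inputs, output):
    # Save the feed-forward output of the source-language pass (last token only).
    cached["ffn"] = output[:, -1, :].detach()

def transplant_hook(module, inputs, output):
    # Only act on the prefill pass over the prompt (seq_len > 1); later decoding
    # steps with a KV cache see single-token inputs and are left untouched.
    if output.shape[1] == 1:
        return output
    output = output.clone()
    output[:, -1, :] = cached["ffn"].to(output.dtype)
    return output

en_prompt = "Answer the question: ..."     # source-language (English) input
zh_prompt = "请回答这个问题: ..."            # target-language input

with torch.no_grad():
    # 1) Source pass: capture the feed-forward activation from the English input.
    h = model.model.layers[SRC_LAYER].mlp.register_forward_hook(capture_hook)
    model(**tok(en_prompt, return_tensors="pt"))
    h.remove()

    # 2) Target pass: inject the cached activation, then decode as usual.
    h = model.model.layers[TGT_LAYER].mlp.register_forward_hook(transplant_hook)
    out = model.generate(**tok(zh_prompt, return_tensors="pt"), max_new_tokens=32)
    h.remove()

print(tok.decode(out[0], skip_special_tokens=True))
```

The direction of the transplantation (En -> non-En or non-En -> En) is controlled simply by which prompt is used in the capture pass and which in the generation pass; which layer pair to patch is a free choice in this sketch.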