Large language models (LLMs) perform strongly across tasks and languages, yet how improvements in one task or language affect other tasks and languages remains poorly understood. We conduct a controlled LoRA fine-tuning study across multiple open-weight LLM families and scales, using a standardised grid of 11 languages and four benchmarks. We fine-tune each model on a single task-language source pair and measure transfer by evaluating on all other task-language target pairs. We decompose transfer into three regimes: (i) Matched-Task (Cross-Language), (ii) Matched-Language (Cross-Task), and (iii) Cross-Task (Cross-Language). Single-source fine-tuning yields a net positive uplift across regimes, but the gains are strongly asymmetric. Matched-Task (Cross-Language) transfer emerges as the most effective and predictable regime, driven principally by the identity of the target language rather than by model architecture. We identify a stable hierarchy in which high-resource languages and broad semantic tasks act as efficient recipients that absorb gains from diverse sources, while specialised tasks and lower-resource languages remain comparatively isolated. These results imply that effective fine-tuning requires navigating donor-recipient roles to maximise downstream gains.
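To make the regime decomposition concrete, the minimal sketch below classifies each source-to-target pair of the evaluation grid into one of the three regimes and averages the measured uplift per regime. The `TransferResult` container, field names, and example records (e.g. the `qa`/`nli` task labels) are hypothetical assumptions for illustration, not the study's released code.

```python
# Minimal sketch of the three-regime transfer decomposition described above.
# All names and example values are illustrative assumptions.
from dataclasses import dataclass
from collections import defaultdict
from statistics import mean


@dataclass(frozen=True)
class TransferResult:
    source_task: str
    source_lang: str
    target_task: str
    target_lang: str
    uplift: float  # change in target-pair score after single-source fine-tuning


def regime(r: TransferResult) -> str:
    """Assign a source-to-target pair to one of the three transfer regimes."""
    if r.source_task == r.target_task and r.source_lang != r.target_lang:
        return "matched-task (cross-language)"
    if r.source_lang == r.target_lang and r.source_task != r.target_task:
        return "matched-language (cross-task)"
    if r.source_task != r.target_task and r.source_lang != r.target_lang:
        return "cross-task (cross-language)"
    return "source pair"  # the fine-tuned task-language pair itself


def mean_uplift_by_regime(results: list[TransferResult]) -> dict[str, float]:
    """Average uplift per regime, excluding the fine-tuned source pair."""
    buckets: dict[str, list[float]] = defaultdict(list)
    for r in results:
        key = regime(r)
        if key != "source pair":
            buckets[key].append(r.uplift)
    return {k: mean(v) for k, v in buckets.items()}


if __name__ == "__main__":
    demo = [
        TransferResult("qa", "en", "qa", "sw", 1.2),
        TransferResult("qa", "en", "nli", "en", 0.4),
        TransferResult("qa", "en", "nli", "sw", 0.1),
    ]
    print(mean_uplift_by_regime(demo))
```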