Massively multilingual language models enable cross-lingual generalization but underperform on low-resource and unseen languages. While adapter-based fine-tuning offers a parameter-efficient solution, training language-specific adapters at scale remains costly. We introduce Typologically Informed Parameter Aggregation (TIPA), a training-free method that constructs proxy language adapters by aggregating existing ones, weighted by typological similarity. Integrated into the MAD-X framework, these proxies enable zero-shot cross-lingual transfer without additional training. We evaluate TIPA on five NLP tasks covering over 230 languages. TIPA consistently matches or outperforms baselines such as English-only fine-tuning and selecting the typologically closest language adapter, with the largest gains for languages that lack a dedicated adapter. Our results demonstrate that typologically informed aggregation is a viable, training-free alternative to language-specific modules.
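To make the aggregation step concrete, the sketch below builds a proxy adapter as a similarity-weighted average of existing adapter parameters. It is a minimal illustration under stated assumptions, not the paper's implementation: the `tipa_proxy_adapter` name, the dict-of-tensors adapter layout, and the precomputed similarity scores are all hypothetical, and how typological similarity is computed (e.g., from typological feature vectors) is left to the caller.

```python
import torch


def tipa_proxy_adapter(adapters: dict[str, dict[str, torch.Tensor]],
                       similarities: dict[str, float]) -> dict[str, torch.Tensor]:
    """Construct a proxy language adapter for a target language without training.

    adapters:     {lang: {param_name: tensor}} -- trained language adapters
    similarities: {lang: score} -- typological similarity of each source
                  language to the target language (higher = more similar)
    """
    # Normalize similarity scores into convex-combination weights.
    total = sum(similarities.values())
    weights = {lang: score / total for lang, score in similarities.items()}

    # Parameter-wise weighted average across the source adapters.
    param_names = next(iter(adapters.values())).keys()
    proxy = {}
    for name in param_names:
        proxy[name] = sum(weights[lang] * adapters[lang][name]
                          for lang in adapters)
    return proxy
```

In a MAD-X-style setup, the resulting parameter dict would be loaded into the language-adapter slot of the frozen multilingual model, with the task adapter stacked on top as usual; no gradient updates are involved.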