While preliminary findings indicate that multilingual LLMs exhibit reduced bias compared to monolingual ones, a comprehensive understanding of the effect of multilingual training on bias mitigation is lacking. This study addresses this gap by systematically training six LLMs of identical size (2.6B parameters) and architecture: five monolingual models (English, German, French, Italian, and Spanish) and one multilingual model trained on an equal distribution of data across these languages, all using publicly available data. To ensure robust evaluation, standard bias benchmarks were automatically translated into the five target languages, and human annotators verified both translation quality and bias preservation. Our results consistently demonstrate that multilingual training effectively mitigates bias. Moreover, we observe that the multilingual model achieves not only lower bias but also superior prediction accuracy compared to monolingual models with the same amount of training data, model architecture, and size.