Despite the centrality of crosslinguistic influence (CLI) to bilingualism research, human studies often yield conflicting results due to inherent experimental variance. We address these inconsistencies by using language models (LMs) as controlled statistical learners to systematically simulate CLI and isolate its underlying drivers. Specifically, we study the effect of varying L1 dominance and L2 proficiency, which we manipulate by controlling the L2 age of exposure (defined as the training step at which the L2 is introduced). Furthermore, we investigate the impact of pretraining on L1s that vary in syntactic distance from the L2. Using cross-linguistic priming, we analyze how activating L1 structures affects L2 processing. Our results align with evidence from psycholinguistic studies, confirming that language dominance and proficiency are strong predictors of CLI. We further find that while priming of grammatical structures is bidirectional, priming of ungrammatical structures is sensitive to language dominance. Finally, we provide mechanistic evidence of CLI in LMs, demonstrating that the L1 is co-activated during L2 processing and directly influences the neural circuitry recruited for the L2. More broadly, our work demonstrates that LMs can serve as a computational framework to inform theories of human CLI.