During language acquisition, children follow a typical sequence of learning stages: they first learn to categorize phonemes, then develop their lexicon, and eventually master increasingly complex syntactic structures. However, the computational principles that lead to this learning trajectory remain largely unknown. To investigate this, we compare the learning trajectories of deep language models to those of children. Specifically, we test whether GPT-2, over the course of its training, exhibits stages of language acquisition comparable to those observed in children aged between 18 months and 6 years. To this end, we train 48 GPT-2 models from scratch and evaluate their syntactic and semantic abilities at each training step, using 96 probes curated from the BLiMP, Zorro and BIG-Bench benchmarks. We then compare these evaluations with the behavior of 54 children during language production. Our analyses reveal three main findings. First, like children, the language models tend to learn linguistic skills in a systematic order. Second, this learning scheme is parallel: even the language tasks that are learned last improve from the very first training steps. Third, some, but not all, learning stages are shared between children and these language models. Overall, these results shed new light on the principles of language acquisition, and highlight important divergences in how humans and modern algorithms learn to process natural language.