Recent studies show evidence for emergent cognitive abilities in large pre-trained language models (PLMs). The increasing cognitive alignment of these models has made them candidates for cognitive science theories. Prior research into the emergent cognitive abilities of PLMs has largely been path-independent with respect to model training, i.e., it has focused on the final model weights rather than on intermediate training steps. However, building plausible models of human cognition using PLMs would benefit from considering the developmental alignment of their performance during training with the trajectories of children's thinking. Guided by psychometric tests of human intelligence, we choose four sets of tasks (numerical ability, linguistic abilities, conceptual understanding, and fluid reasoning) to investigate the alignment of ten popular families of PLMs, evaluating their available intermediate and final training checkpoints. We find a striking regularity: regardless of model size, the developmental trajectories of PLMs consistently exhibit a window of maximal alignment with human cognitive development. Before that window, training appears to endow "blank slate" models with the requisite structure to be poised to rapidly learn from experience. After that window, training appears to serve the engineering goal of reducing loss but not the scientific goal of increasing alignment with human cognition.