Recent advancements in artificial intelligence have sparked interest in the parallels between large language models (LLMs) and human neural processing, particularly in language comprehension. While prior research has established similarities between the representations of LLMs and those of the brain, the underlying computational principles that cause this convergence, especially in the context of evolving LLMs, remain elusive. Here, we examine a diverse selection of high-performance LLMs with similar parameter sizes to investigate the factors contributing to their alignment with the brain's language processing mechanisms. We find that as LLMs achieve higher performance on benchmark tasks, they not only become more brain-like, as measured by higher performance when predicting neural responses from LLM embeddings, but their hierarchical feature extraction pathways also map more closely onto the brain's while using fewer layers to do the same encoding. We also compare the feature extraction pathways of the LLMs to each other and identify new ways in which high-performing models have converged toward similar hierarchical processing mechanisms. Finally, we show the importance of contextual information in improving model performance and brain similarity. Our findings reveal the converging aspects of language processing in the brain and LLMs and offer new directions for developing models that align more closely with human cognitive processing.