With the rise of large language models (LLMs), the metric "Brainscore" has emerged as a means of evaluating the functional similarity between LLMs and the human brain or neural systems. We investigate what this score means by constructing topological features from human fMRI data covering 190 subjects and from 39 LLMs together with their untrained counterparts. We then train 36 linear regression models and conduct statistical analyses to identify which of the constructed features are reliable and valid. Our findings reveal distinctive feature combinations that help interpret existing brainscores across brain regions of interest (ROIs) and hemispheres, contributing to interpretable machine learning (iML) research. We also discuss and analyze existing brainscores further. To our knowledge, this study is the first attempt to interpret the brainscore metric in this interdisciplinary domain.