The correspondence between large language models (LLMs) and the neural mechanisms underlying human higher-order cognition remains insufficiently characterized. Given that language and reasoning in the human brain appear dissociable, an open question is whether LLMs align with neural signals from reasoning-related regions and whether such signals can improve them. Here, focusing on deductive reasoning, we show that LLM internal representations are not only partially aligned with task-fMRI activity but can also be directly enhanced by these signals. Using a neural-predictivity metric, we find that LLMs explain a substantial fraction of the explainable variance in reasoning-related regions at the aggregate level, whereas predictivity within specific reasoning types is lower, indicating both alignment and divergence. Building on this, we propose a brain-guided framework: we steer model representations along directions induced by the joint structure of model and brain representations, applied as an intervention at inference time and as fine-tuning during training. We demonstrate that task-evoked brain signals can directly enhance LLM reasoning, yielding gains orthogonal to language-only supervision across 10 LLMs (1.5B-72B parameters), with transfer across reasoning types and an absolute accuracy gain of up to 13%. Our results advance LLM-brain correspondences from correlation to guidance, establishing a brain-signal-driven pathway toward more robust and cognitively aligned AI.
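The two quantitative ideas in the abstract, ceiling-normalized neural predictivity and steering a representation along a direction, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of Pearson correlation as the score, the per-voxel noise ceiling, and the simple additive steering rule are all assumptions made for illustration. The regression that would map model activations to predicted voxel responses is omitted, so the sketch starts from predictions that are already available.

```python
import math


def pearson_r(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def normalized_predictivity(predicted, measured, ceilings):
    """Average ceiling-normalized correlation across voxels.

    predicted, measured: one list of held-out responses per voxel.
    ceilings: hypothetical per-voxel noise ceilings (the best correlation
    any model could reach given measurement noise); an assumption here.
    """
    scores = [
        pearson_r(p, m) / c
        for p, m, c in zip(predicted, measured, ceilings)
    ]
    return sum(scores) / len(scores)


def steer(hidden, direction, strength):
    """Shift a hidden vector along the unit-normalized direction.

    In a brain-guided setting the direction would be derived from the
    joint structure of model and brain representations; here it is
    simply passed in (assumption).
    """
    norm = math.sqrt(sum(d * d for d in direction))
    return [h + strength * d / norm for h, d in zip(hidden, direction)]
```

A call such as `steer(hidden, direction, strength)` would be applied to a chosen layer's activations at inference time; the choice of layer and of `strength` is left open here because the abstract does not specify either.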