Do LLMs and VLMs Share Neurons for Inference? Evidence and Mechanisms of Cross-Modal Transfer

Large vision-language models (LVLMs) have rapidly advanced across various domains, yet they still lag behind strong text-only large language models (LLMs) on tasks that require multi-step inference and compositional decision-making. Motivated by their shared transformer architectures, we investigate whether the two model families rely on common internal computation for such inference. At the neuron level, we uncover a surprisingly large overlap: more than half of the top-activated units during multi-step inference are shared between representative LLMs and LVLMs, revealing a modality-invariant inference subspace. Through causal probing via activation amplification, we further show that these shared neurons encode consistent and interpretable concept-level effects, demonstrating their functional contribution to inference. Building on this insight, we propose Shared Neuron Low-Rank Fusion (SNRF), a parameter-efficient framework that transfers mature inference circuitry from LLMs to LVLMs. SNRF profiles cross-model activations to identify shared neurons, computes a low-rank approximation of inter-model weight differences, and injects these updates selectively within the shared-neuron subspace. This mechanism strengthens multimodal inference performance with minimal parameter changes and requires no large-scale multimodal fine-tuning. Across diverse mathematics and perception benchmarks, SNRF consistently enhances LVLM inference performance while preserving perceptual capabilities. Our results demonstrate that shared neurons form an interpretable bridge between LLMs and LVLMs, enabling low-cost transfer of inference ability into multimodal models. Our code is available at [https://github.com/chenhangcuisg-code/Do-LLMs-VLMs-Share-Neurons](https://github.com/chenhangcuisg-code/Do-LLMs-VLMs-Share-Neurons).
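The three SNRF steps described above (profile activations, take a low-rank approximation of the weight difference, inject it only at shared neurons) can be illustrated with a toy sketch. This is not the authors' implementation. It assumes NumPy, a single weight matrix whose rows correspond to neurons, and LLM and LVLM weights of identical shape. The function names, the choice of mean absolute activation as the ranking score, and the use of truncated SVD are all hypothetical choices made for illustration.

```python
import numpy as np


def top_k_neurons(acts, k):
    """Return indices of the k neurons with the largest mean |activation|.

    acts: array of shape (num_tokens, num_neurons).
    The ranking criterion is an assumption of this sketch.
    """
    scores = np.abs(acts).mean(axis=0)
    return set(np.argsort(scores)[-k:].tolist())


def overlap_ratio(neurons_a, neurons_b):
    """Fraction of the first top-k set that also appears in the second."""
    return len(neurons_a & neurons_b) / max(len(neurons_a), 1)


def low_rank(delta, rank):
    """Best rank-`rank` approximation of `delta` via truncated SVD."""
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank]


def fuse(w_vlm, w_llm, shared, rank):
    """Add the low-rank LLM-minus-LVLM difference to shared-neuron rows only.

    w_vlm, w_llm: arrays of shape (num_neurons, in_dim), same shape (assumed).
    shared: set of row indices treated as shared neurons.
    """
    delta = low_rank(w_llm - w_vlm, rank)
    mask = np.zeros(w_vlm.shape[0], dtype=bool)
    mask[sorted(shared)] = True
    fused = w_vlm.copy()
    fused[mask] += delta[mask]  # rows outside the shared set stay untouched
    return fused


# Toy usage with synthetic data: neurons 0..3 are made strongly active
# in both "models" so that they form the shared set.
rng = np.random.default_rng(0)
boost = np.ones(16)
boost[:4] = 10.0
acts_llm = rng.normal(size=(200, 16)) * boost
acts_vlm = rng.normal(size=(200, 16)) * boost

shared = top_k_neurons(acts_llm, 4) & top_k_neurons(acts_vlm, 4)
w_llm = rng.normal(size=(16, 8))
w_vlm = rng.normal(size=(16, 8))
w_fused = fuse(w_vlm, w_llm, shared, rank=2)
```

Because only the rows in `shared` are modified and the update is a row subset of a rank-2 matrix, the change to the LVLM weights stays both sparse across neurons and low-rank, which is the property the abstract attributes to the method.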
