Nearby neurons in cortex share similar response profiles, producing systematic spatial organization across sensory and cognitive systems. Recent topographic models reproduce aspects of this structure, but they remain unimodal and impose spatial constraints on each layer separately. The resulting maps are fragmented and capture neither the contiguity of cortical processing streams nor their integration across modalities. We introduce Topo-Omni, a topographic multimodal model in which visual, auditory, and language/cognitive processing share a single contiguous in silico sheet. Built by fine-tuning a pretrained foundation model with a spatial smoothness objective, the model develops clusters in every modality, from sensory to cognitive systems, that are consistent with human neuroimaging. Driving a cluster selectively biases perception, and suppressing it selectively impairs perception, paralleling human intervention studies. Finally, we use the model to screen for novel clusters in silico and discover new natural-landscape and animal networks, which we validate in human data. A single spatial principle thus organizes representations across modalities and processing stages, yielding testable hypotheses about cortical organization.
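The abstract does not spell out the spatial smoothness objective, so the following is only a minimal sketch of one common way such a term can be written, not the loss used for Topo-Omni. It assumes that units are laid out on a 2D grid and that the penalty is the mean of one minus the Pearson correlation between the responses of adjacent units. The names `grid_neighbor_pairs` and `smoothness_loss`, the 8 x 8 sheet, and the Gaussian tuning curves are all hypothetical choices made for illustration.

```python
import numpy as np


def grid_neighbor_pairs(height, width):
    """Return index pairs of horizontally and vertically adjacent units
    on a height x width sheet (hypothetical layout, row-major order)."""
    idx = np.arange(height * width).reshape(height, width)
    horiz = np.stack([idx[:, :-1].ravel(), idx[:, 1:].ravel()], axis=1)
    vert = np.stack([idx[:-1, :].ravel(), idx[1:, :].ravel()], axis=1)
    return np.concatenate([horiz, vert], axis=0)


def smoothness_loss(responses, pairs, eps=1e-8):
    """Mean (1 - Pearson correlation) over adjacent unit pairs.

    responses: array of shape (n_stimuli, n_units).
    pairs: integer array of shape (n_pairs, 2) indexing units.
    This is an assumed formulation, not the paper's objective.
    """
    # Center and L2-normalize each unit's response vector so that a
    # dot product between two units equals their Pearson correlation.
    z = responses - responses.mean(axis=0, keepdims=True)
    z = z / (np.linalg.norm(z, axis=0, keepdims=True) + eps)
    corr = np.sum(z[:, pairs[:, 0]] * z[:, pairs[:, 1]], axis=0)
    return float(np.mean(1.0 - corr))


# Toy comparison: a sheet whose tuning varies smoothly with position
# versus the same units placed at shuffled positions.
rng = np.random.default_rng(0)
height, width, n_stimuli = 8, 8, 50
pairs = grid_neighbor_pairs(height, width)

# Each unit prefers a latent stimulus value set by its column.
preferred = np.tile(np.linspace(0.0, 1.0, width), height)
latent = rng.uniform(0.0, 1.0, size=(n_stimuli, 1))
responses = np.exp(-((latent - preferred[None, :]) ** 2) / (2 * 0.2 ** 2))

smooth_loss = smoothness_loss(responses, pairs)
shuffled_loss = smoothness_loss(responses[:, rng.permutation(height * width)], pairs)
print(f"smooth sheet: {smooth_loss:.3f}, shuffled sheet: {shuffled_loss:.3f}")
```

In a fine-tuning setup, a term of this kind would typically be added to the task loss with a weighting coefficient, so that the pretrained model keeps its task performance while nearby units are pushed toward similar response profiles. The weighting, the neighborhood definition, and how units from different layers and modalities are assigned positions on the shared sheet are design choices that the abstract leaves unspecified.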