The Pictorial Cortex: Zero-Shot Cross-Subject fMRI-to-Image Reconstruction via Compositional Latent Modeling

Decoding visual experiences from human brain activity remains a central challenge at the intersection of neuroscience, neuroimaging, and artificial intelligence. A critical obstacle is the inherent variability of cortical responses: neural activity elicited by the same visual stimulus differs across individuals and trials due to anatomical, functional, cognitive, and experimental factors, which makes the fMRI-to-image mapping non-injective. In this paper, we tackle a challenging yet practically meaningful problem: zero-shot cross-subject fMRI-to-image reconstruction, where the visual experience of a previously unseen individual must be reconstructed without subject-specific training. To enable principled evaluation, we present UniCortex-fMRI, a unified cortical-surface dataset assembled from multiple visual-stimulus fMRI datasets to provide broad coverage of subjects and stimuli. UniCortex-fMRI is processed into a standardized data format, which makes it possible to study cross-subject fMRI-to-image reconstruction in the zero-shot setting. To tackle the modeling challenge, we propose PictorialCortex, which models fMRI activity using a compositional latent formulation that structures stimulus-driven representations under subject-, dataset-, and trial-related variability. PictorialCortex operates in a universal cortical latent space and implements this formulation through a latent factorization-composition module, reinforced by paired factorization and re-factorizing consistency regularization. During inference, surrogate latents synthesized under multiple seen-subject conditions are aggregated to guide diffusion-based image synthesis for unseen subjects. Extensive experiments show that PictorialCortex improves zero-shot cross-subject visual reconstruction, highlighting the benefits of compositional latent modeling and multi-dataset training.
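As a rough illustration of the inference step described above, the following toy sketch factorizes a latent into a stimulus part and a condition part, recomposes the stimulus part with each seen-subject condition, and averages the resulting surrogate latents. Everything in it is an assumption made for illustration: the function names (`factorize`, `compose`, `aggregate_surrogates`), the split-by-slicing factorization, the concatenation-based composition, and mean aggregation are hypothetical stand-ins, not the learned modules or the aggregation rule used by PictorialCortex.

```python
import numpy as np


def factorize(latent, stim_dim):
    """Toy factorization (assumption): the first `stim_dim` entries are the
    stimulus-driven part, the rest encode subject/dataset/trial conditions."""
    return latent[:stim_dim], latent[stim_dim:]


def compose(stimulus, condition):
    """Toy composition (assumption): concatenate stimulus and condition parts."""
    return np.concatenate([stimulus, condition])


def aggregate_surrogates(latent, seen_conditions, stim_dim):
    """Build one surrogate latent per seen-subject condition and average them.

    Mean aggregation is an illustrative choice, not the paper's stated rule.
    """
    stimulus, _ = factorize(latent, stim_dim)
    surrogates = np.stack([compose(stimulus, c) for c in seen_conditions])
    return surrogates.mean(axis=0)


# Hypothetical unseen-subject latent and two seen-subject condition codes.
unseen_latent = np.arange(6.0)
seen_conditions = [np.array([1.0, 1.0]), np.array([3.0, 5.0])]
guidance = aggregate_surrogates(unseen_latent, seen_conditions, stim_dim=4)
```

In this sketch, `guidance` would play the role of the conditioning signal handed to a diffusion-based image generator. Note that `factorize(compose(s, c), stim_dim)` returns `s` and `c` unchanged here, which is the trivial analogue of the re-factorizing consistency that the abstract enforces through regularization.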
