Structure-Guided Diffusion Model for EEG-Based Visual Cognition Reconstruction

Objective: Decoding visual information from electroencephalography (EEG) is an important problem in neuroscience and brain-computer interface (BCI) research. Existing methods are largely restricted to natural images and categorical representations, with limited capacity to capture structural features or to differentiate objective perception from subjective cognition. We propose a Structure-Guided Diffusion Model (SGDM) that incorporates explicit structural information into EEG-based visual reconstruction.

Approach: SGDM uses a two-stage generative mechanism and is evaluated on the Kilogram abstract visual object dataset and the THINGS natural image dataset. The framework combines a structurally supervised variational autoencoder with a spatiotemporal EEG encoder that is aligned to a visual embedding space via contrastive learning. Structural information is integrated into a diffusion model through ControlNet to guide image generation from EEG features.

Results: SGDM outperforms existing methods on both the abstract and the natural image dataset. Reconstructed images achieve higher fidelity in both low-level visual features and semantic representations, indicating improved decoding accuracy and strong generalization across visual domains. Spatiotemporal analysis of the EEG signals further reveals hierarchical structural encoding patterns, consistent with the neural dynamics of visual cognition.

Significance: These findings validate the effectiveness of SGDM in capturing explicit structural geometry and in generating images with high fidelity to individual cognitive representations. By enabling the decoding of complex visual content from EEG signals, the framework extends neural decoding beyond low-dimensional or categorical outputs. This supports BCIs with more degrees of freedom for intention decoding and more flexible brain-to-machine communication.
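To make the contrastive alignment step in the Approach concrete, the following is a minimal sketch of a symmetric, CLIP-style InfoNCE loss between paired EEG and image embeddings. The abstract does not specify the loss form, temperature, or embedding size, so the function name `symmetric_contrastive_loss`, the temperature of 0.07, and the toy data are all assumptions made for illustration, not the authors' implementation. It is written in plain Python for portability; a real training loop would use a tensor library with automatic differentiation.

```python
import math
import random


def normalize(vec):
    """Scale a vector to unit L2 norm (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def cross_entropy_diag(logits):
    """Mean cross-entropy where row i's correct class is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_sum_exp = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def symmetric_contrastive_loss(eeg_emb, img_emb, temperature=0.07):
    """Hypothetical helper: average of EEG-to-image and image-to-EEG InfoNCE.

    eeg_emb[i] and img_emb[i] are assumed to come from the same trial,
    so the matching pairs lie on the diagonal of the similarity matrix.
    """
    eeg = [normalize(v) for v in eeg_emb]
    img = [normalize(v) for v in img_emb]
    # Cosine similarity between every EEG and every image embedding.
    logits = [
        [sum(a * b for a, b in zip(e, i)) / temperature for i in img]
        for e in eeg
    ]
    logits_t = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(logits_t))


# Toy data (assumed sizes): 8 trials, 16-dimensional embeddings.
random.seed(0)
img_emb = [[random.gauss(0, 1) for _ in range(16)] for _ in range(8)]
# "Aligned" EEG embeddings: near-copies of their paired image embeddings.
aligned_eeg = [[x + 0.01 * random.gauss(0, 1) for x in v] for v in img_emb]
# Mismatched pairing: reverse the trial order so no pair lines up.
mismatched_eeg = aligned_eeg[::-1]

loss_aligned = symmetric_contrastive_loss(aligned_eeg, img_emb)
loss_mismatched = symmetric_contrastive_loss(mismatched_eeg, img_emb)
print(loss_aligned < loss_mismatched)
```

Minimizing a loss of this kind pulls each EEG embedding toward the embedding of the image viewed on the same trial and pushes it away from the other images in the batch, which is what allows EEG features to stand in for visual embeddings when conditioning the generator.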
