Existing EEG-driven image reconstruction methods often overlook spatial attention mechanisms, limiting fidelity and semantic coherence. To address this, we propose a dual-conditioning framework that combines EEG embeddings with spatial saliency maps to enhance image generation. Our approach leverages the Adaptive Thinking Mapper (ATM) for EEG feature extraction and fine-tunes Stable Diffusion 2.1 via Low-Rank Adaptation (LoRA) to align neural signals with visual semantics, while a ControlNet branch conditions generation on saliency maps for spatial control. Evaluated on THINGS-EEG, our method achieves significant improvements in both low- and high-level image feature quality over existing approaches, while aligning strongly with human visual attention. The results demonstrate that attentional priors resolve ambiguities in EEG signals, enabling high-fidelity reconstructions with applications in medical diagnostics and neuroadaptive interfaces, and advancing neural decoding through efficient adaptation of pre-trained diffusion models.
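A minimal sketch of the dual-conditioning setup described above, assuming the Hugging Face `diffusers` and `peft` APIs (not specified in the source): the ATM encoder output is stood in for by a random tensor shaped like SD 2.1's text conditioning, and the model IDs, LoRA rank, and tensor shapes are illustrative assumptions, not the paper's actual configuration.

```python
import torch
from diffusers import StableDiffusionPipeline, ControlNetModel
from peft import LoraConfig

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

# Base generator: pre-trained Stable Diffusion 2.1.
pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1", torch_dtype=dtype
).to(device)

# ControlNet branch initialized from the base UNet; it will consume the
# spatial saliency map as its conditioning image.
controlnet = ControlNetModel.from_unet(pipe.unet).to(device, dtype)

# LoRA adapters on the UNet attention projections: only the low-rank
# update matrices are trained, keeping the pre-trained weights frozen.
pipe.unet.add_adapter(
    LoraConfig(r=8, lora_alpha=16,
               target_modules=["to_q", "to_k", "to_v", "to_out.0"])
)

# Stand-ins for the two conditions: an ATM-style EEG embedding shaped like
# SD 2.1's text conditioning (77 tokens x 1024 dims), and a saliency map
# at the model's 768x768 resolution. Both are placeholders.
eeg_embed = torch.randn(1, 77, 1024, device=device, dtype=dtype)
saliency_map = torch.rand(1, 3, 768, 768, device=device, dtype=dtype)

# One dual-conditioned denoising step: the ControlNet injects spatial
# residuals derived from the saliency map into the LoRA-adapted UNet,
# while the EEG embedding replaces the usual text-prompt embedding.
latents = torch.randn(1, 4, 96, 96, device=device, dtype=dtype)
t = torch.tensor([500], device=device)
down_res, mid_res = controlnet(
    latents, t,
    encoder_hidden_states=eeg_embed,
    controlnet_cond=saliency_map,
    return_dict=False,
)
pred = pipe.unet(
    latents, t,
    encoder_hidden_states=eeg_embed,
    down_block_additional_residuals=down_res,
    mid_block_additional_residual=mid_res,
).sample  # model prediction, matched against the scheduler's target in training
```

In this sketch only the LoRA matrices and the ControlNet branch would receive gradients during fine-tuning, which is what makes the adaptation of the frozen diffusion backbone parameter-efficient.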