NECOMIMI (NEural-COgnitive MultImodal EEG-Informed Image Generation with Diffusion Models) is a framework for generating images directly from EEG signals using diffusion models. Whereas previous work focused solely on EEG-image classification through contrastive learning, NECOMIMI extends the task to image generation. The proposed NERV EEG encoder achieves state-of-the-art (SoTA) performance across multiple zero-shot classification tasks, including 2-way, 4-way, and 200-way, and obtains the top results on the Category-based Assessment Table (CAT) Score, a new metric we propose for EEG-to-image evaluation that assesses the quality of EEG-generated images based on semantic concepts. We also establish a benchmark on the ThingsEEG dataset. A key finding is that the model tends to generate abstract or generalized images, such as landscapes, rather than specific objects, which highlights the inherent difficulty of translating noisy, low-resolution EEG data into detailed visual outputs. This study underscores the potential of EEG-to-image generation while revealing the challenges that remain in bridging neural activity and visual representation.