In this work, we study EEG classification for visual brain decoding using two frameworks built on two different learning paradigms. Given the spatio-temporal nature of EEG data, the first framework is based on a CNN-BiLSTM model. The second uses a CNN-Transformer architecture, which relies on the more versatile attention-based learning paradigm. In both cases, a dedicated 1D-CNN feature extraction module generates the initial embeddings by applying 1D convolutions along the time and EEG channel dimensions. Because EEG signals are noisy and non-stationary, and their discriminative features are less clear than in semantically structured data such as text or images, we classify individual windows of each signal and apply majority voting over the window predictions at inference time to obtain signal-level labels. To illustrate how brain patterns correlate with different image classes, we visualize t-SNE plots of the BiLSTM embeddings alongside brain activation maps for the top 10 classes. These visualizations reveal distinct neural signatures associated with each visual category and show that the BiLSTM can capture and represent the discriminative brain activity linked to visual stimuli. We evaluate our approach on the updated EEG-ImageNet dataset, where it compares favorably with state-of-the-art methods.
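The window-based inference described in the abstract (classify each window, then take a majority vote to label the whole signal) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the window length, the stride, or a tie-breaking rule, so `window`, `stride`, and the first-seen tie-break are placeholders. `toy_window_classifier` is a hypothetical stand-in for a trained CNN-BiLSTM or CNN-Transformer.

```python
from collections import Counter
from typing import Callable, List, Sequence

# A signal is stored as channels x time samples (an assumed layout).
Signal = Sequence[Sequence[float]]
Window = List[List[float]]


def sliding_windows(signal: Signal, window: int, stride: int) -> List[Window]:
    """Cut a (channels x time) signal into fixed-length windows along time."""
    n_samples = len(signal[0])
    if window > n_samples:
        raise ValueError("window is longer than the signal")
    return [
        [list(channel[start:start + window]) for channel in signal]
        for start in range(0, n_samples - window + 1, stride)
    ]


def majority_vote(labels: Sequence[int]) -> int:
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def classify_signal(
    signal: Signal,
    classify_window: Callable[[Window], int],
    window: int,
    stride: int,
) -> int:
    """Label every window, then aggregate to one signal-level label."""
    votes = [classify_window(w) for w in sliding_windows(signal, window, stride)]
    return majority_vote(votes)


def toy_window_classifier(w: Window) -> int:
    """Hypothetical stand-in for a trained model: class 1 if the mean is positive."""
    values = [v for channel in w for v in channel]
    return 1 if sum(values) / len(values) > 0 else 0
```

Calling `classify_signal(signal, toy_window_classifier, window=4, stride=2)` on a two-channel signal with eight samples per channel produces three overlapping windows and returns the label that most of them agree on. In practice the stand-in would be replaced by the trained network's per-window prediction, and the window parameters would be chosen to match the training setup.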