A Temporal-Spectral Fusion Transformer with Subject-Specific Adapter for Enhancing RSVP-BCI Decoding

The Rapid Serial Visual Presentation (RSVP)-based Brain-Computer Interface (BCI) is an efficient technology for target retrieval using electroencephalography (EEG) signals. Traditional decoding methods improve only when given a substantial amount of training data from each new test subject, which lengthens the preparation time of BCI systems. Several studies introduce data from existing subjects to reduce this dependence on new-subject data, but their optimization strategies, based on adversarial learning with extensive data, increase training time during the preparation procedure. Moreover, most previous methods focus on a single view of the EEG signal and ignore information from other views that may further improve performance. To enhance decoding performance while reducing preparation time, we propose a Temporal-Spectral fusion transformer with Subject-specific Adapter (TSformer-SA). Specifically, a cross-view interaction module facilitates information transfer between the two-view features extracted from EEG temporal signals and spectrogram images, and extracts representations common to both views. An attention-based fusion module then fuses the features of the two views into comprehensive discriminative features for classification. Furthermore, a multi-view consistency loss maximizes the feature similarity between the two views of the same EEG signal. Finally, a subject-specific adapter rapidly transfers the knowledge of the model trained on data from existing subjects to the decoding of data from new subjects. Experimental results show that TSformer-SA significantly outperforms the comparison methods and achieves outstanding performance with limited training data from new subjects, which facilitates efficient decoding and rapid deployment of BCI systems in practical use.
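To make the idea of a multi-view consistency loss concrete, the following is a minimal sketch and not the formulation used in TSformer-SA, which the abstract does not specify. It assumes that feature similarity is measured by cosine similarity and that the loss is the batch mean of one minus that similarity, so that minimizing the loss maximizes agreement between the temporal-view and spectral-view features of the same EEG sample. The function names `cosine_similarity` and `consistency_loss` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Assumption: a zero vector is treated as having no agreement.
        return 0.0
    return dot / (norm_u * norm_v)


def consistency_loss(temporal_feats, spectral_feats):
    """Mean of (1 - cosine similarity) over paired two-view features.

    temporal_feats[i] and spectral_feats[i] are assumed to be the features
    of the temporal view and the spectrogram view of the same EEG sample.
    This form of the loss is an illustrative assumption.
    """
    if len(temporal_feats) != len(spectral_feats):
        raise ValueError("both views must contain the same number of samples")
    if not temporal_feats:
        raise ValueError("at least one sample is required")
    total = 0.0
    for t, s in zip(temporal_feats, spectral_feats):
        total += 1.0 - cosine_similarity(t, s)
    return total / len(temporal_feats)
```

Under this assumed form, the loss is 0 when the two views produce features pointing in the same direction and grows toward 2 as they point in opposite directions. In training it would be added to the classification loss with a weighting factor, which is likewise not given in the abstract.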
