We present a sequential transfer learning framework for transformers on functional Magnetic Resonance Imaging (fMRI) data and demonstrate its significant benefits for decoding musical timbre. In the first of two phases, we pre-train our stacked-encoder transformer architecture on Next Thought Prediction, a self-supervised task of predicting whether or not one sequence of fMRI data follows another. This phase imparts a general understanding of the temporal and spatial dynamics of neural activity, and can be applied to any fMRI dataset. In the second phase, we fine-tune the pre-trained models and train additional fresh models on the supervised task of predicting whether or not two sequences of fMRI data were recorded while listening to the same musical timbre. The fine-tuned models achieve significantly higher accuracy with shorter training times than the fresh models, demonstrating the efficacy of our framework for facilitating transfer learning on fMRI data. Additionally, our fine-tuning task achieves a level of classification granularity beyond standard methods. This work contributes to the growing literature on transformer architectures for sequential transfer learning on fMRI data, and provides evidence that our framework is an improvement over current methods for decoding timbre.
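
The two pair-classification tasks described above can be sketched as data-construction routines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the (time points x features) array layout, the 50/50 positive/negative split, and the choice to draw Next Thought Prediction negatives from elsewhere in the same run are all hypothetical.

```python
import numpy as np


def next_thought_pairs(run, seq_len, rng):
    """Build (first, second, label) pairs from a single fMRI run.

    run: array of shape (n_timepoints, n_features), e.g. TRs x voxels.
    label 1: `second` starts exactly where `first` ends.
    label 0: `second` is some other window of the same run
             (assumed negative-sampling scheme, not from the paper).
    """
    n_timepoints = run.shape[0]
    n_starts = n_timepoints - 2 * seq_len + 1
    if n_starts < 1:
        raise ValueError("run is too short for the requested seq_len")

    firsts, seconds, labels = [], [], []
    for start in range(n_starts):
        true_next = start + seq_len
        if rng.random() < 0.5:
            second_start, label = true_next, 1
        else:
            # Any valid window start except the true continuation.
            candidates = [s for s in range(n_timepoints - seq_len + 1)
                          if s != true_next]
            second_start, label = int(rng.choice(candidates)), 0
        firsts.append(run[start:start + seq_len])
        seconds.append(run[second_start:second_start + seq_len])
        labels.append(label)
    return np.stack(firsts), np.stack(seconds), np.array(labels)


def same_timbre_pairs(timbre_ids):
    """Label every unordered pair of sequences: 1 if same timbre, else 0.

    timbre_ids: one (hypothetical) timbre identifier per fMRI sequence.
    Returns the index pairs and their binary labels.
    """
    pairs, labels = [], []
    for i in range(len(timbre_ids)):
        for j in range(i + 1, len(timbre_ids)):
            pairs.append((i, j))
            labels.append(int(timbre_ids[i] == timbre_ids[j]))
    return pairs, np.array(labels)
```

Both routines yield binary labels over pairs of sequences, so under these assumptions the same paired-input classifier could be pre-trained on the first and fine-tuned on the second, which is what makes the transfer between the two phases direct.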