Labeling multivariate biomedical time series data is laborious and expensive. Self-supervised contrastive learning reduces the need for large labeled datasets by pretraining on unlabeled data. However, for multivariate time series, the set of input channels often varies between applications, and most existing work does not allow transfer between datasets with different sets of input channels. We propose learning a single encoder that operates on each input channel individually. We then use a message passing neural network to extract a single representation across channels. We demonstrate the potential of this method by pretraining our model on a dataset with six EEG channels and then fine-tuning it on a dataset with two different EEG channels. We compare models with and without the message passing neural network across different contrastive loss functions. We show that our method, combined with the TS2Vec loss, outperforms all other methods in most settings.
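The channel-agnostic idea described in the abstract can be illustrated with a small, dependency-free sketch. Everything below is an assumption made for illustration and is not taken from the paper: the encoder is a single tanh layer, the channel graph is fully connected, messages are averaged, the readout is a mean over channels, and the weights are random stand-ins for learned parameters. The names `encode_channel`, `message_passing_step`, and `represent` are hypothetical. The sketch only shows why the same parameters can serve a six-channel and a two-channel input.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Random weight matrix (a stand-in for learned parameters)."""
    scale = 1.0 / math.sqrt(cols)
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def mean_vectors(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def encode_channel(signal, w_enc):
    """Shared encoder: the same weights are applied to every channel."""
    return [math.tanh(v) for v in matvec(w_enc, signal)]


def message_passing_step(nodes, w_self, w_msg):
    """One round on a fully connected channel graph (assumed topology)."""
    updated = []
    for i, h in enumerate(nodes):
        others = [v for j, v in enumerate(nodes) if j != i]
        # Aggregate messages from all other channels by averaging.
        msg = mean_vectors(others) if others else [0.0] * len(h)
        own = matvec(w_self, h)
        incoming = matvec(w_msg, msg)
        updated.append([math.tanh(a + b) for a, b in zip(own, incoming)])
    return updated


def represent(channels, params, rounds=2):
    """Map any number of channels to one fixed-size representation."""
    nodes = [encode_channel(c, params["enc"]) for c in channels]
    for _ in range(rounds):
        nodes = message_passing_step(nodes, params["self"], params["msg"])
    # Mean readout: independent of channel count and channel order.
    return mean_vectors(nodes)


WINDOW, DIM = 16, 8  # samples per window and embedding size (arbitrary)
rng = random.Random(0)
params = {
    "enc": make_matrix(DIM, WINDOW, rng),
    "self": make_matrix(DIM, DIM, rng),
    "msg": make_matrix(DIM, DIM, rng),
}

# Synthetic stand-ins for a six-channel and a two-channel recording.
six_channels = [[rng.gauss(0, 1) for _ in range(WINDOW)] for _ in range(6)]
two_channels = [[rng.gauss(0, 1) for _ in range(WINDOW)] for _ in range(2)]

rep_six = represent(six_channels, params)
rep_two = represent(two_channels, params)
```

Because no parameter depends on the number or identity of the channels, both calls reuse the same `params` and return vectors of length `DIM`. A practical implementation would replace the toy layers with a trained time series encoder in a deep learning framework and train it with a contrastive loss.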