Recent research at CHU Sainte-Justine's Pediatric Critical Care Unit (PICU) has shown that traditional machine learning methods, such as semi-supervised label propagation and K-nearest neighbors, outperform Transformer-based models at detecting artifacts in photoplethysmography (PPG) signals, particularly when data are limited. This study addresses the underuse of abundant unlabeled data by employing self-supervised learning (SSL) to extract latent features from those data, followed by fine-tuning on labeled data. Our experiments demonstrate that SSL significantly enhances the Transformer model's ability to learn representations, improving its robustness in artifact classification. Among the SSL techniques compared, namely masking, contrastive learning, and DINO (self-distillation with no labels), contrastive learning exhibited the most stable and the best performance on small PPG datasets. We further investigate the optimization of contrastive loss functions, which are central to contrastive SSL. Inspired by InfoNCE, we introduce a novel contrastive loss function that facilitates smoother training and better convergence, thereby improving artifact classification performance. In summary, this study establishes the efficacy of SSL in leveraging unlabeled data, particularly for strengthening the Transformer model. This approach holds promise for broader applications in PICU environments, where annotated data are often limited.
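For readers unfamiliar with the InfoNCE objective that the abstract cites as inspiration, the following is a minimal sketch of the standard InfoNCE loss in plain Python. It is not the novel loss proposed in this study, whose form the abstract does not specify. The function names `l2_normalize` and `info_nce`, the default temperature of 0.1, and the toy two-dimensional embeddings are illustrative assumptions; in practice the embeddings would come from an encoder applied to two augmented views of the same PPG window.

```python
import math


def l2_normalize(v):
    # Scale a non-zero vector to unit length so dot products are cosine similarities.
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def info_nce(anchors, positives, temperature=0.1):
    """Standard InfoNCE over a batch (illustrative helper, not the paper's loss).

    anchors[i] and positives[i] are embeddings of two views of the same sample;
    every other positives[j] serves as a negative for anchors[i].
    """
    a = [l2_normalize(v) for v in anchors]
    p = [l2_normalize(v) for v in positives]
    total = 0.0
    for i, ai in enumerate(a):
        # Temperature-scaled cosine similarity of anchor i with every candidate.
        logits = [sum(x * y for x, y in zip(ai, pj)) / temperature for pj in p]
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(z - m) for z in logits))
        # Cross-entropy with the matching view as the correct class.
        total += log_denominator - logits[i]
    return total / len(a)


# Toy embeddings (assumed values): each row of view_b is a slightly
# perturbed copy of the matching row of view_a.
view_a = [[1.0, 0.0], [0.0, 1.0]]
view_b = [[0.9, 0.1], [0.1, 0.9]]

aligned = info_nce(view_a, view_b)         # matching pairs on the diagonal
shuffled = info_nce(view_a, view_b[::-1])  # pairs deliberately mismatched
```

With these toy values the aligned loss is smaller than the shuffled one, which is the behavior a contrastive objective relies on: embeddings of two views of the same signal are pulled together, while embeddings of different signals are pushed apart.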