Contrastive representation learning is crucial in medical time series analysis because it reduces the dependency on labor-intensive, domain-specific, and scarce expert annotations. However, existing contrastive learning methods primarily focus on a single data level and therefore fail to fully exploit the hierarchical structure of medical time series. To address this issue, we present COMET, a hierarchical framework that leverages data consistency at all inherent levels of medical time series. Our model systematically captures data consistency at four levels: observation, sample, trial, and patient. By defining contrastive losses at multiple levels, we learn effective representations that preserve comprehensive data consistency and maximize information utilization in a self-supervised manner. We conduct experiments in the challenging patient-independent setting, comparing COMET against six baselines on three diverse datasets, which include ECG signals for myocardial infarction and EEG signals for Alzheimer's and Parkinson's diseases. The results demonstrate that COMET consistently outperforms all baselines across all datasets, particularly in the setups with 10% and 1% labeled data fractions. These results underscore the impact of our framework in advancing contrastive representation learning techniques for medical time series. The source code is available at https://github.com/DL4mHealth/COMET.
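To make the idea of contrasting at more than one level concrete, the following is a minimal NumPy sketch and not the COMET implementation from the repository above. It assumes an InfoNCE-style loss in which embeddings that share a group label (a trial ID or a patient ID) are treated as positives, and it combines only the trial and patient levels with a weighted sum. The function names `group_contrastive_loss` and `two_level_loss`, the temperature, and the weights are hypothetical choices made for illustration; the observation and sample levels, which would typically require augmented views, are omitted.

```python
import numpy as np


def group_contrastive_loss(z, group_ids, temperature=0.1):
    """InfoNCE-style loss where embeddings sharing a group id are positives.

    Illustrative assumption, not the loss defined by the COMET authors.
    z: (n, d) array of embeddings; group_ids: length-n array of labels.
    """
    z = np.asarray(z, dtype=float)
    group_ids = np.asarray(group_ids)
    n = len(z)

    # Cosine similarity scaled by the temperature.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = z @ z.T / temperature

    # Exclude each embedding's similarity with itself.
    not_self = ~np.eye(n, dtype=bool)
    sim = np.where(not_self, sim, -np.inf)

    # Numerically stable log-softmax over the other embeddings in each row.
    sim = sim - sim.max(axis=1, keepdims=True)
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))

    # Positives: same group, different index.
    positives = (group_ids[:, None] == group_ids[None, :]) & not_self
    pos_count = positives.sum(axis=1)
    has_pos = pos_count > 0
    if not has_pos.any():
        return 0.0  # no anchor has a positive partner

    # Average negative log-probability of the positives, per anchor.
    pos_log_prob = np.where(positives, log_prob, 0.0).sum(axis=1)
    per_anchor = -pos_log_prob[has_pos] / pos_count[has_pos]
    return float(per_anchor.mean())


def two_level_loss(z, trial_ids, patient_ids, weights=(0.5, 0.5)):
    """Weighted sum of trial-level and patient-level losses (hypothetical weights)."""
    w_trial, w_patient = weights
    return (w_trial * group_contrastive_loss(z, trial_ids)
            + w_patient * group_contrastive_loss(z, patient_ids))


if __name__ == "__main__":
    # Toy data: 6 sample embeddings from 3 trials belonging to 2 patients.
    rng = np.random.default_rng(0)
    z = rng.normal(size=(6, 8))
    trial_ids = np.array([0, 0, 1, 1, 2, 2])
    patient_ids = np.array([0, 0, 0, 0, 1, 1])
    print(two_level_loss(z, trial_ids, patient_ids))
```

In this sketch the same embeddings are scored under two different notions of "positive pair", so an encoder trained on the sum is pushed to keep samples from the same trial close while also keeping all trials of the same patient close. Note that under the patient-independent setting described above, patient IDs would only be used this way within the training split.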