Electrocardiograms (ECGs) are electrical recordings of the heart that are critical for diagnosing cardiovascular conditions. ECG language models (ELMs) have recently emerged as a promising framework for ECG classification accompanied by report generation. However, current models cannot forecast future cardiac events, despite the immense clinical value of such forecasts for planning earlier intervention. To address this gap, we propose CAMEL, the first ELM capable of inference over longer signal durations, which enables forecasting. Our key insight is a specialized ECG encoder that enables cross-modal understanding of ECG signals and text. We train CAMEL using established LLM training procedures, combining LoRA adaptation with a curriculum learning pipeline. Our curriculum includes ECG classification, metric calculation, and multi-turn conversations to elicit reasoning. CAMEL demonstrates strong zero-shot performance across 6 tasks and 9 datasets, including ECGForecastBench, a new benchmark that we introduce for forecasting arrhythmias. CAMEL is on par with or surpasses ELMs and fully supervised baselines both in- and out-of-distribution, achieving state-of-the-art (SOTA) results on ECGBench (+7.0% absolute average gain) and on ECGForecastBench (+12.4% over fully supervised models and +21.1% over zero-shot ELMs).