ECG-Language Models (ELMs) extend recent advances in Multimodal Large Language Models (MLLMs) to automated ECG interpretation. However, most existing ELMs inherit Vision-Language Model (VLM) design choices and rely on pretrained ECG encoders, which adds substantial architectural and training complexity. Inspired by encoder-free VLMs, we introduce ELF, a family of three encoder-free ELM architectures that remain competitive with, and often outperform, prior state-of-the-art ELMs across two datasets despite substantially simpler architectures and training pipelines. All code and data are available at github.com/ELM-Research/ECG-Language-Models.