Despite their success in numerous fields, machine learning and deep learning systems have not yet firmly established themselves in mission-critical healthcare applications. One of the main reasons is that model performance deteriorates significantly on previously unseen, Out-of-Distribution samples. This is known as the Domain Generalization (DG) problem. In this work, we propose a benchmark for evaluating DG algorithms and introduce a novel architecture for tackling DG in biosignal classification. We describe the DG problem for biosignals, focusing on electrocardiograms (ECG) and electroencephalograms (EEG), and propose and implement an open-source biosignal DG evaluation benchmark. Furthermore, we adapt state-of-the-art DG algorithms from computer vision to 1D biosignal classification and evaluate their effectiveness. Finally, we introduce a novel neural network architecture that leverages multi-layer representations for improved model generalizability. Using this DG setup, we experimentally demonstrate the presence of the DG problem in ECG and EEG datasets. In addition, our proposed model outperforms the baseline algorithms, exceeding the state of the art on both datasets. Given the significance of the distribution shift present in biosignal datasets, the presented benchmark aims to encourage further research into biomedical DG by simplifying the evaluation of proposed algorithms. To our knowledge, this is the first attempt at developing an open-source framework for evaluating ECG and EEG DG algorithms.