Despite their remarkable performance, deep neural networks have yet to be adopted in clinical practice, which is thought to be partly due to their lack of explainability. In this work, we apply attribution methods to a pre-trained deep neural network (DNN) for 12-lead electrocardiography classification to open this "black box" and understand the relationship between model predictions and learned features. We classify data from a public data set, and the attribution methods assign a "relevance score" to each sample of the classified signals. This allows us to analyze what the network learned during training, for which we propose quantitative methods: averaging relevance scores over a) classes, b) leads, and c) average beats. Analyses of the relevance scores for atrial fibrillation (AF) and left bundle branch block (LBBB), compared with healthy controls, show that their mean values a) increase with higher classification probability and correspond to false classifications when close to zero, and b) correspond to clinical recommendations regarding which lead to consider. Furthermore, c) visible P-waves and concordant T-waves result in clearly negative relevance scores in AF and LBBB classification, respectively. In summary, our analysis suggests that the DNN learned features similar to cardiology textbook knowledge.
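The three averaging schemes named in the abstract (over classes, over leads, and over average beats) could be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the array layout `(n_records, n_leads, n_samples)`, all function names, the availability of R-peak sample indices, and the window parameters `before` and `after` are assumptions introduced here.

```python
import numpy as np


def mean_relevance_per_record(relevance):
    """Mean relevance of each record.

    relevance: array of shape (n_records, n_leads, n_samples) (assumed layout).
    Returns an array of shape (n_records,).
    """
    return relevance.mean(axis=(1, 2))


def mean_relevance_per_class(relevance, labels):
    """Average the per-record means within each class label."""
    labels = np.asarray(labels)
    per_record = mean_relevance_per_record(relevance)
    return {str(c): float(per_record[labels == c].mean()) for c in np.unique(labels)}


def mean_relevance_per_lead(relevance):
    """Mean relevance of each lead, pooled over records and samples."""
    return relevance.mean(axis=(0, 2))


def mean_relevance_per_beat(relevance_record, r_peaks, before, after):
    """Average relevance over beats for a single record.

    relevance_record: array of shape (n_leads, n_samples).
    r_peaks: sample indices of R peaks (assumed to come from a separate detector).
    before, after: hypothetical window sizes in samples around each R peak.
    Returns an array of shape (n_leads, before + after).
    """
    n_samples = relevance_record.shape[1]
    # Keep only beats whose window lies completely inside the recording.
    windows = [
        relevance_record[:, r - before:r + after]
        for r in r_peaks
        if r - before >= 0 and r + after <= n_samples
    ]
    if not windows:
        raise ValueError("no R peak has a complete window inside the record")
    return np.mean(windows, axis=0)
```

Under these assumptions, the per-class means can be compared against classification probability, the per-lead means against the leads recommended clinically, and the average-beat relevance against waveform features such as the P-wave or T-wave.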