This paper introduces the Efficient Decoupled Masked Autoencoder (EDMAE), a novel self-supervised method for recognizing standard views in pediatric echocardiography. EDMAE defines a new proxy task built on the encoder-decoder structure. Its encoder consists of a teacher encoder and a student encoder: the teacher extracts latent representations of the masked image blocks, while the student extracts latent representations of the visible image blocks. A loss computed between the feature maps output by the two encoders enforces consistency between the latent representations they extract. In place of the ViT structure used in the MAE encoder, EDMAE uses pure convolution operations, which improves training efficiency and convergence speed. EDMAE is pre-trained with self-supervised learning on a large-scale private dataset of pediatric echocardiography and then fine-tuned for standard view recognition. The proposed method achieves high classification accuracy on 27 standard views of pediatric echocardiography. To further verify its effectiveness, the method is evaluated on a second downstream task, cardiac ultrasound segmentation, using the public CAMUS dataset. The experimental results show that the proposed method outperforms several popular supervised methods and recent self-supervised methods, and that it is more competitive across different downstream tasks.
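The teacher/student proxy task described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `split_patches` and `consistency_loss`, the 0.75 mask ratio, and the use of mean squared error as the consistency loss are all assumptions made for illustration, since the abstract does not specify the masking ratio, the loss function, or how the two encoders' feature maps are aligned. The sketch assumes the features have already been flattened to equal-length lists.

```python
import random


def split_patches(num_patches, mask_ratio, seed=0):
    """Randomly partition patch indices into visible and masked sets.

    Hypothetical helper: in this sketch the visible indices would feed the
    student encoder and the masked indices would feed the teacher encoder.
    """
    rng = random.Random(seed)
    indices = list(range(num_patches))
    rng.shuffle(indices)
    num_masked = int(num_patches * mask_ratio)
    masked = sorted(indices[:num_masked])
    visible = sorted(indices[num_masked:])
    return visible, masked


def consistency_loss(student_features, teacher_features):
    """Mean squared error between two equal-length feature lists.

    MSE is an assumed stand-in for the consistency loss; the abstract only
    states that a loss is computed between the two encoders' feature maps.
    """
    if len(student_features) != len(teacher_features):
        raise ValueError("feature lists must have the same length")
    squared = [(s - t) ** 2 for s, t in zip(student_features, teacher_features)]
    return sum(squared) / len(squared)


# Example: 16 patches with an assumed mask ratio of 0.75.
visible, masked = split_patches(16, 0.75)
loss = consistency_loss([0.0, 0.0], [1.0, 3.0])
```

In a real training loop the feature lists would be tensors produced by the convolutional student and teacher encoders, and the loss would be backpropagated through the student branch; those details are outside what the abstract states.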