Benchmarking Self-Supervised Models for Cardiac Ultrasound View Classification

Reliable interpretation of cardiac ultrasound images is essential for accurate clinical diagnosis and assessment. Self-supervised learning has shown promise in medical imaging by leveraging large unlabelled datasets to learn meaningful representations. In this study, we evaluate and compare two self-supervised learning frameworks, USF-MAE (developed by our team) and MoCo v3, on the recently introduced CACTUS dataset (37,736 images) for automated classification of simulated cardiac views (A4C, PL, PSAV, PSMV, Random, and SC). The CACTUS dataset provides expert-annotated cardiac ultrasound images covering diverse views. To ensure a fair comparison, we adopt an identical training protocol for both models: each is trained with a learning rate of 0.0001 and a weight decay of 0.01 and evaluated with 5-fold cross-validation, which allows generalization performance to be assessed across multiple random splits. For each fold, we record ROC-AUC, accuracy, F1-score, and recall. USF-MAE consistently outperforms MoCo v3 across these metrics. The mean test AUC is 99.99% (+/-0.01%, 95% CI) for USF-MAE versus 99.97% (+/-0.01%) for MoCo v3, and the mean test accuracy is 99.33% (+/-0.18%) versus 98.99% (+/-0.28%). Similar trends are observed for F1-score and recall, and the improvements are statistically significant across folds (paired t-test, p = 0.0048, below the 0.01 significance level). This proof-of-concept analysis suggests that USF-MAE learns more discriminative features for cardiac view classification than MoCo v3 on this dataset, highlighting its potential for improving automated cardiac ultrasound classification.
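As an illustration of how per-fold results of this kind can be summarized, the sketch below computes a mean with a 95% confidence interval and a two-sided paired t-test over five matched folds. It is a minimal sketch under stated assumptions, not the analysis code used in this study: the t-based interval, the helper names (`mean_ci95`, `t_cdf_df4`, `paired_t_test`), and the per-fold scores are all hypothetical, and the scores are placeholders rather than the reported results.

```python
import math
from statistics import mean, stdev

# Two-sided 95% critical value of Student's t with 4 degrees of freedom
# (5 folds - 1). Assumption: the interval is t-based.
T_CRIT_DF4 = 2.776


def mean_ci95(scores):
    """Return (mean, 95% CI half-width) for five per-fold scores."""
    assert len(scores) == 5, "critical value above assumes 5 folds (df = 4)"
    sem = stdev(scores) / math.sqrt(len(scores))  # standard error of the mean
    return mean(scores), T_CRIT_DF4 * sem


def t_cdf_df4(t):
    """Closed-form CDF of Student's t distribution with 4 degrees of freedom."""
    scale = 1.0 + t * t / 4.0
    return 0.5 + 0.375 * (t / math.sqrt(scale)) * (1.0 - (t * t / scale) / 12.0)


def paired_t_test(a, b):
    """Two-sided paired t-test on matched per-fold scores; returns (t, p).

    Raises ZeroDivisionError if all per-fold differences are identical.
    """
    assert len(a) == len(b) == 5, "closed-form CDF above assumes 5 folds"
    diffs = [x - y for x, y in zip(a, b)]
    t = mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))
    p = 2.0 * (1.0 - t_cdf_df4(abs(t)))
    return t, p


if __name__ == "__main__":
    # Hypothetical per-fold accuracies (placeholders, NOT the study's values).
    model_a = [0.991, 0.994, 0.992, 0.995, 0.993]
    model_b = [0.988, 0.990, 0.991, 0.989, 0.992]

    for name, scores in (("model A", model_a), ("model B", model_b)):
        m, hw = mean_ci95(scores)
        print(f"{name}: {100 * m:.2f}% (+/-{100 * hw:.2f}%)")

    t_stat, p_value = paired_t_test(model_a, model_b)
    print(f"paired t-test: t = {t_stat:.3f}, p = {p_value:.4f}")
```

The test is paired because both models are scored on the same folds, so fold-to-fold difficulty cancels out in the per-fold differences. With only five folds the test has four degrees of freedom, which is why the closed-form CDF for that case is sufficient here; a general implementation would use a statistics library instead.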
