Machine learning (ML) models have gained popularity in medical imaging analysis given their expert-level performance in many medical domains. To enhance the trustworthiness, acceptance, and regulatory compliance of medical imaging models, and to facilitate their integration into clinical settings, we review and categorise methods for ensuring ML reliability, both during development and throughout the model's lifespan. Specifically, we provide an overview of methods for assessing a model's inner workings with respect to bias encoding, and for detecting data drift, in the context of disease classification models. Additionally, to assess severity when a significant drift is detected, we provide an overview of methods developed for estimating classifier accuracy when ground-truth labels are unavailable. This should enable practitioners to implement methods that ensure reliable ML deployment and consistent prediction performance over time.