Machine learning techniques are effective for building predictive models because they identify patterns in large datasets. Development of a model for a complex real-life problem often stops at publication, at proof of concept, or once the model is made accessible through some mode of deployment. However, a model in the medical domain risks becoming obsolete as patient demographics, systems and clinical practices change. Maintaining and monitoring the performance of a predictive model after publication is crucial to its safe and effective long-term use. We assess the infrastructure required to monitor the outputs of a machine learning algorithm, and present two scenarios with examples of model monitoring and updating: first, a breast cancer prognosis model trained on public longitudinal data, and second, a neurodegenerative disease stratification algorithm that is currently being developed and tested in clinic.