Prevention is better than cure. This old truth applies not only to diseases but also to problems with AI models used in medicine. The source of a malfunctioning predictive model often lies not in the training process but further upstream, in the data acquisition phase or in the design of the experiment. In this paper, we analyze a single use case in detail: a Kaggle competition on detecting abnormalities in lung X-ray images. We demonstrate how a series of simple tests for data imbalance exposes faults in the data acquisition and annotation process. Complex models are able to learn such artifacts, and the resulting bias is difficult to remove during or after training. Errors made at the data collection stage also make it difficult to validate the model correctly. Based on this use case, we show how to monitor data and model balance (fairness) throughout the life cycle of a predictive model, from data acquisition to parity analysis of model scores.
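As a rough illustration of what a simple test for data imbalance might look like, the sketch below compares the share of positive labels across the values of an acquisition attribute and summarizes the gap as a parity ratio. It is not the procedure used in the paper: the field names `view` and `abnormal`, the toy records, and both helper functions are hypothetical, chosen only to make the idea concrete.

```python
from collections import Counter


def positive_rate_by_group(records, group_key, label_key):
    """Return {group: share of records in that group with a truthy label}."""
    totals = Counter()
    positives = Counter()
    for rec in records:
        group = rec[group_key]
        totals[group] += 1
        if rec[label_key]:
            positives[group] += 1
    return {g: positives[g] / totals[g] for g in totals}


def parity_ratio(rates):
    """Smallest group rate divided by the largest; 1.0 means balanced groups."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # no positives in any group, so nothing to compare
    return min(rates.values()) / highest


# Hypothetical toy data: 'view' stands in for an acquisition attribute,
# 'abnormal' for the annotated label. These are not values from the paper.
records = [
    {"view": "AP", "abnormal": 1},
    {"view": "AP", "abnormal": 1},
    {"view": "AP", "abnormal": 1},
    {"view": "AP", "abnormal": 0},
    {"view": "PA", "abnormal": 1},
    {"view": "PA", "abnormal": 0},
    {"view": "PA", "abnormal": 0},
    {"view": "PA", "abnormal": 0},
]

rates = positive_rate_by_group(records, "view", "abnormal")
ratio = parity_ratio(rates)
print(rates, ratio)
```

A ratio far below 1 suggests that the label is entangled with how the image was acquired, which a model could exploit as a shortcut. The same two functions can be applied to thresholded model predictions instead of labels to check parity of model scores at a later stage of the life cycle.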