Machine learning approaches to image classification have led to impressive advances. For example, convolutional neural networks achieve remarkable classification accuracy across a wide range of applications in industry, defense, and other areas. While these models boast impressive accuracy, a related concern is how to assess and maintain calibration in the predictions they make. A classification model is said to be well calibrated if its predicted probabilities correspond with the rates at which events actually occur. Although many methods are available to assess machine learning calibration and recalibrate faulty predictions, less effort has been spent on developing approaches that continually monitor predictive models for potential loss of calibration over time. We propose a cumulative sum-based approach with dynamic limits that enables detection of miscalibration in both traditional process monitoring and concept drift applications. This enables early detection of operational context changes that impact image classification performance in the field. The proposed chart can be used broadly in any situation where the user needs to monitor probability predictions over time for potential lapses in calibration. Importantly, our method operates on probability predictions and event outcomes and does not require under-the-hood access to the machine learning model.
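To illustrate the general idea of monitoring calibration with a cumulative sum (CUSUM) chart, the sketch below tracks the residual between observed outcomes and predicted probabilities. This is a minimal, generic two-sided CUSUM with fixed limits, not the dynamic-limit method proposed in the paper; the allowance `k` and threshold `h` are illustrative values, and `cusum_calibration` is a hypothetical function name. As in the proposed approach, it consumes only probability predictions and event outcomes, with no access to the model internals.

```python
def cusum_calibration(probs, outcomes, k=0.05, h=0.5):
    """Two-sided CUSUM on calibration residuals y_t - p_t.

    probs    : predicted event probabilities from the classifier
    outcomes : observed binary outcomes (0 or 1)
    k        : allowance (slack) absorbing in-control noise -- illustrative
    h        : decision threshold -- illustrative, fixed (not dynamic)

    Returns the first time index at which either chart signals
    possible miscalibration, or None if no signal occurs.
    """
    s_hi = s_lo = 0.0
    for t, (p, y) in enumerate(zip(probs, outcomes)):
        e = y - p                        # calibration residual
        s_hi = max(0.0, s_hi + e - k)    # accumulates under-prediction
        s_lo = max(0.0, s_lo - e - k)    # accumulates over-prediction
        if s_hi > h or s_lo > h:
            return t
    return None
```

For example, a model that predicts 0.5 while the event occurs every time drives the upper chart over the threshold within a couple of observations, whereas predictions of 0.5 against outcomes that alternate between 1 and 0 keep both statistics near zero and never signal.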