Machine learning models are essential tools in many domains, but their performance can degrade over time because of changes in the data distribution or other factors. On the one hand, detecting and addressing such degradation is crucial for keeping models reliable. On the other hand, given enough data, an arbitrarily small change in quality can be detected. Because interventions such as model retraining or replacement can be expensive, we argue that they should be carried out only when the change exceeds a given threshold. We propose a sequential monitoring scheme to detect these relevant changes. The proposed method reduces unnecessary alerts and overcomes the multiple testing problem by accounting for the temporal dependence of the measured model quality. We provide conditions under which the scheme is consistent and attains a specified asymptotic level. Empirical validation on simulated and real data shows that our approach detects relevant changes in model quality better than benchmark methods. Our research contributes a practical way to distinguish minor fluctuations from meaningful degradation in machine learning model performance, helping keep models reliable in dynamic environments.
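To make the idea of a "relevant" change concrete, the following is a minimal Python sketch, not the scheme proposed in the paper. It assumes the model quality is observed as a stream of per-observation losses, compares the running mean of that stream against a baseline mean, and raises an alert only when the increase exceeds a threshold `delta` by more than a noise margin. The function names, the batch-means variance estimate, the constant `z`, and the simulated data are all illustrative assumptions. In particular, `z` is a fixed constant and is not calibrated to guarantee any sequential error level.

```python
import math
import random
from statistics import mean, variance


def long_run_sd(xs, n_batches=10):
    """Batch-means estimate of the long-run standard deviation.

    Averaging within batches absorbs short-range temporal dependence
    that a plain sample variance would ignore. (Illustrative choice,
    not taken from the paper.)
    """
    size = len(xs) // n_batches
    batch_means = [mean(xs[i * size:(i + 1) * size]) for i in range(n_batches)]
    return math.sqrt(size * variance(batch_means))


def first_relevant_alert(baseline, stream, delta, z=3.0, min_obs=50):
    """Return the first time index at which the mean loss increase is
    judged to exceed `delta`, or None if no alert is raised.

    `z` is a hypothetical fixed multiplier, not a calibrated
    sequential critical value.
    """
    base = mean(baseline)
    sd = long_run_sd(baseline)
    total = 0.0
    for t, x in enumerate(stream, start=1):
        total += x
        if t < min_obs:
            continue  # wait until the running mean is reasonably stable
        increase = total / t - base
        margin = z * sd * math.sqrt(1.0 / t + 1.0 / len(baseline))
        # Alert only if the increase is above delta even after
        # subtracting the noise margin.
        if increase - margin > delta:
            return t
    return None


def simulate_losses(rng, n, level, sd=0.05):
    """Simulated i.i.d. Gaussian losses around `level` (toy data)."""
    return [rng.gauss(level, sd) for _ in range(n)]


rng = random.Random(0)
baseline = simulate_losses(rng, 500, 0.20)
minor = simulate_losses(rng, 500, 0.21)  # shift of 0.01, below delta
major = simulate_losses(rng, 500, 0.30)  # shift of 0.10, above delta

print("minor shift alert:", first_relevant_alert(baseline, minor, delta=0.05))
print("major shift alert:", first_relevant_alert(baseline, major, delta=0.05))
```

In this toy setup the shift of 0.01 stays well below `delta = 0.05` and should not trigger an alert, while the shift of 0.10 should. A test for any nonzero change (`delta = 0`) would eventually flag the minor shift as well, given enough data, which is the behavior the abstract argues against.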