In safety-critical domains such as autonomous driving and medical diagnosis, the reliability of machine learning models is crucial. One significant challenge to reliability is concept drift, which can cause model deterioration over time. Traditionally, drift detectors rely on true labels, which are often scarce and costly. This study conducts a comprehensive empirical evaluation of using uncertainty values as substitutes for error rates in detecting drifts, aiming to alleviate the reliance on labeled post-deployment data. We examine five uncertainty estimation methods in conjunction with the ADWIN detector across seven real-world datasets. Our results reveal that while the SWAG method exhibits superior calibration, the overall accuracy in detecting drifts is not notably impacted by the choice of uncertainty estimation method, with even the most basic method demonstrating competitive performance. These findings offer valuable insights into the practical applicability of uncertainty-based drift detection in real-world, safety-critical applications.
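The core idea, monitoring a stream of uncertainty values instead of a stream of error indicators, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the study's setup: the uncertainty measure is the normalised entropy of a predicted class distribution (not necessarily one of the five methods evaluated), the stream is synthetic, and `TwoWindowDetector` is a hypothetical, much simplified stand-in for ADWIN that compares two fixed-size windows against a Hoeffding-style threshold.

```python
import math
import random
from collections import deque


def predictive_entropy(probs):
    """Shannon entropy of a class distribution, normalised to [0, 1]."""
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(len(probs))


class TwoWindowDetector:
    """Hypothetical simplified detector (NOT ADWIN itself).

    Compares the mean of a fixed reference window with the mean of a
    sliding recent window and signals drift when the gap exceeds a
    Hoeffding-style threshold for values bounded in [0, 1].
    """

    def __init__(self, window=200, delta=0.002):
        self.window = window
        # Threshold for the difference of two means of `window` values each.
        self.threshold = math.sqrt(math.log(2.0 / delta) / window)
        self.reference = deque(maxlen=window)
        self.recent = deque(maxlen=window)

    def update(self, value):
        """Add one value; return True if drift is signalled at this step."""
        if len(self.reference) < self.window:
            self.reference.append(value)
            return False
        self.recent.append(value)  # oldest value drops out once full
        if len(self.recent) < self.window:
            return False
        gap = abs(sum(self.recent) / self.window
                  - sum(self.reference) / self.window)
        if gap > self.threshold:
            # Start over so the reference is rebuilt from post-drift data.
            self.reference.clear()
            self.recent.clear()
            return True
        return False


def simulate_stream(n=1000, drift_at=500, seed=0):
    """Synthetic binary predictions: confident before drift, unsure after."""
    rng = random.Random(seed)
    for t in range(n):
        if t < drift_at:
            p = 0.95 + rng.uniform(-0.02, 0.02)
        else:
            p = 0.60 + rng.uniform(-0.05, 0.05)
        yield [p, 1.0 - p]


def detect_drifts(stream, detector):
    """Feed uncertainty values (no labels needed) and collect alarm times."""
    alarms = []
    for t, probs in enumerate(stream):
        if detector.update(predictive_entropy(probs)):
            alarms.append(t)
    return alarms


alarms = detect_drifts(simulate_stream(), TwoWindowDetector())
print("drift signalled at steps:", alarms)
```

No true labels enter the loop: the detector sees only the model's own uncertainty. In this toy stream the uncertainty shift is large by construction. Whether real drifts produce a comparable shift is the empirical question the study addresses.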