Real-world applications of machine learning models are often subject to legal or policy-based regulations. Some of these regulations require ensuring the validity of the model, i.e., that its approximation error is smaller than a threshold. A global metric is generally too insensitive to determine the validity of a specific prediction, whereas evaluating local validity is costly because it requires gathering additional data. We propose learning the model error to obtain a local validity estimate, while using active learning to reduce the amount of data required. Using model validation benchmarks, we provide empirical evidence that the proposed method can yield an error model with sufficient discriminative properties from a relatively small amount of data. Furthermore, we demonstrate increased sensitivity to local changes of the validity bounds compared to alternative approaches.
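The idea of learning the model error and selecting new validation points actively can be illustrated with a small sketch. The abstract does not specify the error model, the acquisition rule, or the benchmarks, so everything below is an assumption made for illustration: the ground truth `true_fn`, the model under validation `model_fn`, a k-nearest-neighbour regressor as the error model, and an acquisition rule that queries the pool point whose predicted error lies closest to the validity threshold. It is not the authors' implementation.

```python
import numpy as np


def true_fn(x):
    # Hypothetical ground truth; in practice each evaluation is a costly measurement.
    return np.sin(3.0 * x)


def model_fn(x):
    # Hypothetical model under validation: cubic Taylor polynomial of sin(3x),
    # which is accurate near x = 0 and degrades towards the interval ends.
    return 3.0 * x - 4.5 * x**3


def abs_error(x):
    # Absolute approximation error of the model at x.
    return np.abs(model_fn(x) - true_fn(x))


def knn_predict(x_train, e_train, x_query, k=3):
    # Assumed error model: mean observed error of the k nearest labeled points (1-D inputs).
    dist = np.abs(x_query[:, None] - x_train[None, :])
    nearest = np.argsort(dist, axis=1)[:, :k]
    return e_train[nearest].mean(axis=1)


def predict_valid(x_train, e_train, x_query, threshold, k=3):
    # Local validity estimate: the predicted error is below the threshold.
    return knn_predict(x_train, e_train, x_query, k=k) < threshold


def active_error_learning(threshold=0.1, n_init=5, n_queries=20, seed=0):
    # Pool-based active learning over a fixed grid of candidate inputs.
    rng = np.random.default_rng(seed)
    pool = np.linspace(-1.0, 1.0, 401)
    labeled = rng.choice(pool.size, size=n_init, replace=False)
    x_train = pool[labeled]
    e_train = abs_error(x_train)
    available = np.ones(pool.size, dtype=bool)
    available[labeled] = False
    for _ in range(n_queries):
        pred = knn_predict(x_train, e_train, pool)
        # Assumed acquisition rule: query where the predicted error is closest
        # to the threshold, i.e. where the validity decision is least clear.
        score = np.abs(pred - threshold)
        score[~available] = np.inf
        pick = int(np.argmin(score))
        available[pick] = False
        x_train = np.append(x_train, pool[pick])
        e_train = np.append(e_train, abs_error(pool[pick]))
    return x_train, e_train
```

Calling `active_error_learning()` returns the queried inputs and their measured errors, and `predict_valid(x_train, e_train, x_query, threshold)` then gives a per-point validity estimate. This acquisition rule is deliberately simple and may concentrate queries in a narrow region; a practical variant would likely add an exploration term or use an error model that provides uncertainty estimates.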