The performance of ML models degrades when the training population differs from the population encountered in deployment. Toward assessing distributional robustness, we study the worst-case performance of a model over all subpopulations of a given size, defined with respect to core attributes Z. This notion of robustness can accommodate arbitrary (continuous) attributes Z, and automatically accounts for complex intersectionality in disadvantaged groups. We develop a scalable yet principled two-stage estimation procedure that can evaluate the robustness of state-of-the-art models. We prove that our procedure enjoys several finite-sample convergence guarantees, including dimension-free convergence. Instead of overly conservative notions based on Rademacher complexities, our evaluation error depends on the dimension of Z only through the out-of-sample error in estimating performance conditional on Z. On real datasets, we demonstrate that our method certifies the robustness of a model and prevents deployment of unreliable models.
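The two-stage idea above can be illustrated with a minimal sketch: first estimate the conditional loss given Z (here via a crude binning regression), then average the worst α-fraction of the estimated conditional losses, which corresponds to the worst-case subpopulation of size α. Everything below is an illustrative assumption, not the paper's actual estimator, which comes with the finite-sample guarantees described above.

```python
import numpy as np

def worst_case_subpopulation_loss(losses, z, alpha, n_bins=10):
    """Illustrative two-stage sketch (not the paper's estimator).

    Stage 1: estimate E[loss | Z] by equal-width binning of a scalar Z.
    Stage 2: average the worst alpha-fraction of the estimated
    conditional losses, i.e. the worst-case subpopulation of size alpha.
    """
    z = np.asarray(z, dtype=float)
    losses = np.asarray(losses, dtype=float)
    # Stage 1: crude nonparametric conditional-mean estimate via binning.
    edges = np.linspace(z.min(), z.max(), n_bins + 1)
    bin_idx = np.digitize(z, edges[1:-1])  # bin index in 0..n_bins-1
    bin_means = np.array([
        losses[bin_idx == b].mean() if np.any(bin_idx == b) else 0.0
        for b in range(n_bins)
    ])
    cond_loss = bin_means[bin_idx]  # estimated conditional loss per sample
    # Stage 2: worst alpha-fraction of conditional losses (a CVaR
    # of the conditional performance over subpopulations defined by Z).
    k = max(1, int(np.ceil(alpha * len(losses))))
    return np.sort(cond_loss)[-k:].mean()
```

In this toy form, a large gap between the worst-case subpopulation loss and the average loss flags a model whose good aggregate performance hides a poorly served subgroup; the paper's procedure makes this comparison with rigorous finite-sample control of the estimation error.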