We provide practical, efficient, and nonparametric methods for auditing the fairness of deployed classification and regression models. Whereas previous work relies on a fixed sample size, our methods are sequential and allow continuous monitoring of incoming data, making them well suited to tracking the fairness of real-world systems. We also allow the data to be collected by a probabilistic policy rather than sampled uniformly from the population, which enables auditing to be conducted on data gathered for another purpose. Moreover, this policy may change over time, and different policies may be used on different subpopulations. Finally, our methods can handle distribution shift resulting from either changes to the model or changes in the underlying population. Our approach is based on recent progress in anytime-valid inference and game-theoretic statistics, in particular the "testing by betting" framework. These connections ensure that our methods are interpretable, fast, and easy to implement. We demonstrate the efficacy of our methods on several benchmark fairness datasets.
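To make the "testing by betting" idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a simplified setting that the abstract does not specify: model outputs lie in [0, 1], observations arrive in pairs (one from each of two groups), and sampling is uniform. The function name `sequential_parity_test`, its parameters, and the plain gradient-ascent betting rule are all illustrative choices. The auditor starts with one unit of wealth and bets on the sign of the difference between the groups' outputs. Under the null hypothesis of equal group means the wealth is a nonnegative martingale, so by Ville's inequality it exceeds 1/alpha with probability at most alpha, at any stopping time.

```python
def sequential_parity_test(pairs, alpha=0.05, max_bet=0.5, step=0.1):
    """Bet against H0: the two groups have the same mean model output.

    `pairs` yields (y0, y1), one model output per group, each in [0, 1].
    Returns the 1-based round at which H0 is rejected, or None if never.
    All names and defaults are hypothetical, chosen for illustration.
    """
    wealth = 1.0  # initial capital
    bet = 0.0     # fixed before each round is observed (predictable)
    for t, (y0, y1) in enumerate(pairs, start=1):
        diff = y0 - y1                  # payoff lies in [-1, 1]
        wealth *= 1.0 + bet * diff      # positive because |bet| <= max_bet <= 0.5
        if wealth >= 1.0 / alpha:       # threshold justified by Ville's inequality
            return t
        # Assumed betting rule: one gradient-ascent step on log-wealth,
        # clipped so that the next bet stays within [-max_bet, max_bet].
        grad = diff / (1.0 + bet * diff)
        bet = max(-max_bet, min(max_bet, bet + step * grad))
    return None
```

Because the test is valid at every round, the audit can stop as soon as the wealth crosses the threshold or simply keep monitoring. A stream in which both groups always receive identical outputs never moves the wealth, so the sketch never rejects on it.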