We study the problem of auditing classifiers for statistical subgroup fairness. Kearns et al. (2018) showed that auditing combinatorial subgroup fairness is as hard as agnostic learning. Essentially all work on remedying statistical measures of discrimination against subgroups assumes access to an oracle for this problem, even though no efficient algorithms are known for it. A recent line of work has produced efficient agnostic learning algorithms for halfspaces when the data distribution is Gaussian, or even merely log-concave. Unfortunately, the reduction of Kearns et al. was formulated in terms of weak, "distribution-free" learning, and thus did not establish a connection for specific families such as log-concave distributions. In this work, we give positive and negative results on auditing under Gaussian distributions. On the positive side, we present an alternative approach that leverages these advances in agnostic learning and thereby obtains the first polynomial-time approximation scheme (PTAS) for auditing a nontrivial notion of combinatorial subgroup fairness: we show how to audit statistical notions of fairness over homogeneous halfspace subgroups when the features are Gaussian. On the negative side, we find that, under cryptographic assumptions, no polynomial-time algorithm can guarantee any nontrivial auditing for general halfspace subgroups, even under Gaussian feature distributions.
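To make the audited quantity concrete, here is a minimal sketch of evaluating one candidate subgroup. It assumes the size-weighted statistical parity gap, Pr[g(x)=1] · |Pr[D(x)=1] − Pr[D(x)=1 | g(x)=1]|, as the fairness measure, which is one common formulation and is not spelled out in the text above. The function names, the toy classifier, and the sample sizes are all hypothetical. The sketch only scores a given homogeneous halfspace on Gaussian samples; it does not perform the search over halfspaces that auditing requires, and it is not the algorithm of this work.

```python
import random


def halfspace_member(w, x):
    """Homogeneous halfspace subgroup: x belongs iff <w, x> >= 0."""
    return sum(wi * xi for wi, xi in zip(w, x)) >= 0.0


def weighted_parity_gap(w, xs, preds):
    """Empirical Pr[g=1] * |Pr[D=1] - Pr[D=1 | g=1]| for the halfspace g given by w.

    This size-weighted gap is an assumed fairness measure, used for illustration.
    """
    n = len(xs)
    members = [halfspace_member(w, x) for x in xs]
    size = sum(members)
    if size == 0:
        return 0.0  # an empty subgroup contributes no violation
    base_rate = sum(preds) / n
    group_rate = sum(p for p, m in zip(preds, members) if m) / size
    return (size / n) * abs(base_rate - group_rate)


def sample_gaussian(n, d, rng):
    """Draw n points from a standard Gaussian in d dimensions."""
    return [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]


rng = random.Random(0)
xs = sample_gaussian(2000, 3, rng)

# Hypothetical classifier under audit: predicts 1 exactly when x[0] > 0.
preds = [1 if x[0] > 0 else 0 for x in xs]

# A halfspace aligned with the classifier exposes a large gap;
# an orthogonal halfspace is nearly independent of it under Gaussian features.
gap_aligned = weighted_parity_gap([1.0, 0.0, 0.0], xs, preds)
gap_orthogonal = weighted_parity_gap([0.0, 1.0, 0.0], xs, preds)
```

In this toy setup the aligned halfspace yields a gap close to 0.25, while the orthogonal one yields a gap close to 0. An auditor would have to find a direction `w` that (approximately) maximizes this quantity, which is where the connection to agnostic learning of halfspaces enters.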