Classical PAC generalization bounds on the prediction risk of a classifier are insufficient to provide theoretical guarantees on fairness when the goal is to learn models balancing predictive risk and fairness constraints. We propose a PAC-Bayesian framework for deriving generalization bounds for fairness, covering both stochastic and deterministic classifiers. For stochastic classifiers, we derive a fairness bound using standard PAC-Bayes techniques. For deterministic classifiers, to which the usual PAC-Bayes arguments do not apply directly, we leverage a recent advance in PAC-Bayes theory to extend the fairness bound beyond the stochastic setting. Our framework has two advantages: (i) it applies to a broad class of fairness measures that can be expressed as a risk discrepancy, and (ii) it leads to a self-bounding algorithm in which the learning procedure directly optimizes a trade-off between generalization bounds on the prediction risk and on fairness. We empirically evaluate our framework with three classical fairness measures, demonstrating not only its usefulness but also the tightness of our bounds.