Worst-case fairness with off-the-shelf demographics achieves group parity by maximizing the model utility of the worst-off group. In practice, however, demographic information is often unavailable, which impedes the direct use of such a max-min formulation. Recent advances have reframed this learning problem by introducing a lower bound $\alpha$ on the minimal partition ratio as side information, a setting we refer to as ``$\alpha$-sized worst-case fairness'' in this paper. We first justify the practical significance of this setting by presenting noteworthy evidence from the data-privacy perspective, which has been overlooked by existing research. Without imposing specific requirements on the loss function, we propose reweighting training samples according to their intrinsic importance to fairness. Given the global nature of the worst-case formulation, we further develop a stochastic learning scheme that simplifies training without compromising model performance. Additionally, we address the issue of outliers and provide a robust variant to handle potential outliers during model training. Our theoretical analysis and experimental observations reveal connections between the proposed approaches and existing ``fairness-through-reweighting'' studies, and extensive experimental results on fairness benchmarks demonstrate the superiority of our methods.
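To make the reweighting idea concrete, here is a minimal sketch, under the common assumption that an $\alpha$-sized worst-case objective reduces to averaging over the worst $\alpha$-fraction of per-sample losses (a CVaR-style criterion); the function name and the specific uniform-weight rule are illustrative, not the paper's actual algorithm.

```python
import numpy as np

def worst_case_weights(losses, alpha):
    """Illustrative sketch: place uniform weight on the ceil(alpha * n)
    samples with the largest losses, i.e. the worst-off alpha-fraction,
    and zero weight elsewhere. The weights sum to 1."""
    n = len(losses)
    k = max(1, int(np.ceil(alpha * n)))   # size of the worst-off group
    worst = np.argsort(losses)[-k:]       # indices of the k largest losses
    w = np.zeros(n)
    w[worst] = 1.0 / k                    # uniform weights over that group
    return w

# With alpha = 0.4 and five samples, the two largest losses receive
# weight 0.5 each and the rest receive weight 0.
losses = np.array([0.1, 0.9, 0.3, 0.7, 0.5])
w = worst_case_weights(losses, alpha=0.4)
```

A reweighted training step would then minimize `(w * losses).sum()`; recomputing `w` per mini-batch is one natural stochastic approximation of the global objective, though the paper's actual scheme may differ.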