Machine learning practitioners frequently observe tension between predictive accuracy and group fairness constraints -- yet sometimes fairness interventions appear to improve accuracy. We show that both phenomena can be artifacts of training data that misrepresents subgroup proportions. Under subpopulation shift (stable within-group distributions, shifted group proportions), we establish: (i) full importance-weighted correction is asymptotically unbiased but suboptimal in finite samples; (ii) the optimal finite-sample correction is a shrinkage reweighting that interpolates between the target and training mixtures; (iii) apparent "fairness helps accuracy" can arise from comparing fairness methods to an improperly weighted baseline. We provide an actionable evaluation protocol: fix representation (optimally) before fairness -- compare fairness interventions against a shrinkage-corrected baseline to isolate the true, irreducible price of fairness. Experiments on synthetic and real-world benchmarks (Adult, COMPAS) validate our theoretical predictions and demonstrate that this protocol eliminates spurious tradeoffs, revealing the genuine fairness-utility frontier.
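To make the shrinkage reweighting concrete, here is a minimal sketch (not the paper's implementation): each example's weight interpolates between 1 (keep the training mixture) and the full importance ratio of target to training group proportions. The function name, the interpolation parameter `lam`, and the toy proportions are illustrative assumptions; the paper's optimal shrinkage level would depend on sample size.

```python
import numpy as np

def shrinkage_weights(groups, pi_target, pi_train, lam):
    """Per-example weights interpolating between no correction (lam = 0,
    keep the training mixture) and full importance weighting toward the
    target mixture (lam = 1). Illustrative sketch only: the optimal lam
    in the paper depends on the finite-sample regime."""
    full = {g: pi_target[g] / pi_train[g] for g in pi_train}
    return np.array([(1 - lam) + lam * full[g] for g in groups])

# Toy example: group "a" is underrepresented in training (20% vs. a 50% target).
groups = ["a", "a", "b", "b", "b"]
w = shrinkage_weights(groups,
                      pi_target={"a": 0.5, "b": 0.5},
                      pi_train={"a": 0.2, "b": 0.8},
                      lam=0.5)
# "a" examples are upweighted (toward 0.5/0.2 = 2.5), "b" downweighted
# (toward 0.5/0.8 = 0.625), each halfway because lam = 0.5.
```

With `lam = 1` this recovers full importance weighting (asymptotically unbiased but finite-sample suboptimal, per (i)); with `lam = 0` it leaves the miscalibrated training mixture untouched.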