Seven years ago, researchers proposed a postprocessing method to equalize the error rates of a model across different demographic groups. The work launched hundreds of papers purporting to improve over the postprocessing baseline. We empirically evaluate these claims through thousands of model evaluations on several tabular datasets. We find that the fairness-accuracy Pareto frontier achieved by postprocessing contains all other methods we were feasibly able to evaluate. In doing so, we address two common methodological errors that have confounded previous observations. One relates to the comparison of methods with different unconstrained base models. The other concerns methods achieving different levels of constraint relaxation. At the heart of our study is a simple idea we call unprocessing that roughly corresponds to the inverse of postprocessing. Unprocessing allows for a direct comparison of methods using different underlying models and levels of relaxation.
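To make the idea of postprocessing concrete, the following is a minimal sketch, assuming a binary classifier that outputs real-valued scores and a single deterministic threshold per group. It is a simplified stand-in and not the method evaluated in the paper: it matches only the true positive rate across groups, whereas the abstract speaks of equalizing error rates more generally. The function names `group_thresholds_for_tpr`, `postprocess`, and `tpr`, as well as the toy data, are hypothetical and chosen for illustration.

```python
import numpy as np


def group_thresholds_for_tpr(scores, labels, groups, target_tpr):
    """Pick one score threshold per group so that the group's true
    positive rate is at least target_tpr (assumes each group has at
    least one positive example)."""
    thresholds = {}
    for g in np.unique(groups):
        # Scores of the positive examples in this group, ascending.
        pos_scores = np.sort(scores[(groups == g) & (labels == 1)])
        # Number of positives that must be accepted to reach the target.
        k = max(int(np.ceil(target_tpr * len(pos_scores))), 1)
        # Accepting every score >= the k-th highest positive score
        # accepts at least k positives.
        thresholds[int(g)] = float(pos_scores[len(pos_scores) - k])
    return thresholds


def postprocess(scores, groups, thresholds):
    """Apply the group-specific thresholds to the base model's scores."""
    preds = np.zeros(len(scores), dtype=int)
    for g, t in thresholds.items():
        mask = groups == g
        preds[mask] = (scores[mask] >= t).astype(int)
    return preds


def tpr(preds, labels):
    """True positive rate: fraction of positives predicted positive."""
    return float(preds[labels == 1].mean())


# Toy data (illustrative only): two groups of four examples each.
scores = np.array([0.9, 0.8, 0.4, 0.3, 0.7, 0.5, 0.2, 0.1])
labels = np.array([1, 1, 0, 0, 1, 0, 1, 0])
groups = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# A single shared threshold treats the groups differently ...
shared_preds = (scores >= 0.5).astype(int)
# ... while group-specific thresholds can match their TPRs.
thresholds = group_thresholds_for_tpr(scores, labels, groups, target_tpr=1.0)
fair_preds = postprocess(scores, groups, thresholds)
```

On this toy data the shared threshold of 0.5 accepts both positives in group 0 but only one of the two positives in group 1, while the group-specific thresholds accept all positives in both groups at the cost of one extra false positive in group 1. Lowering `target_tpr` trades the constraint against accuracy differently, which is the kind of fairness-accuracy trade-off a Pareto frontier summarizes.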