Researchers have proposed many methods for fair and robust machine learning, but comprehensive empirical evaluation of their subgroup robustness is lacking. In this work, we address this gap in the context of tabular data, where sensitive subgroups are clearly defined, real-world fairness problems abound, and prior works often do not compare against state-of-the-art tree-based models as baselines. We conduct an empirical comparison of several previously proposed methods for fair and robust learning alongside state-of-the-art tree-based methods and other baselines. Via experiments with more than $340{,}000$ model configurations on eight datasets, we show that tree-based methods have strong subgroup robustness, even when compared to robustness- and fairness-enhancing methods. Moreover, the best tree-based models tend to perform well over a range of metrics, while robust or group-fair models can be brittle, with significant performance differences across metrics for a fixed model. We also demonstrate that tree-based models are less sensitive to hyperparameter configurations and are less costly to train. Our work suggests that tree-based ensemble models make an effective baseline for tabular data and are a sensible default when subgroup robustness is desired. For associated code and detailed results, see https://github.com/jpgard/subgroup-robustness-grows-on-trees.
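To make the notion of subgroup robustness concrete, the following minimal sketch computes per-subgroup accuracy and the worst-group accuracy for a set of predictions. It is an illustration only: the abstract does not specify which metrics the study uses, so worst-group accuracy is assumed here as one common choice, and the function names and toy data are hypothetical rather than taken from the associated repository.

```python
from collections import defaultdict


def subgroup_accuracies(y_true, y_pred, groups):
    """Return a dict mapping each subgroup label to its accuracy."""
    if not (len(y_true) == len(y_pred) == len(groups)):
        raise ValueError("y_true, y_pred, and groups must have equal length")
    correct = defaultdict(int)
    total = defaultdict(int)
    for yt, yp, g in zip(y_true, y_pred, groups):
        total[g] += 1
        correct[g] += int(yt == yp)
    return {g: correct[g] / total[g] for g in total}


def worst_group_accuracy(y_true, y_pred, groups):
    """Accuracy on the subgroup where the model performs worst."""
    return min(subgroup_accuracies(y_true, y_pred, groups).values())


# Toy, made-up labels and predictions for two subgroups "a" and "b".
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

overall = sum(int(t == p) for t, p in zip(y_true, y_pred)) / len(y_true)
per_group = subgroup_accuracies(y_true, y_pred, groups)
worst = worst_group_accuracy(y_true, y_pred, groups)
```

On this toy data the overall accuracy hides a gap between the two subgroups, which is the kind of disparity a subgroup-robustness comparison is meant to surface. The same functions could be applied to the predictions of any classifier, tree-based or otherwise.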