Predicting Fairness of ML Software Configuration

This paper investigates the relationships between machine learning hyperparameters and fairness. Data-driven solutions are increasingly used in critical socio-technical applications where ensuring fairness is important. Rather than explicitly encoding decision logic via control and data structures, ML developers provide input data, perform some pre-processing, choose ML algorithms, and tune hyperparameters (HPs) to infer a program that encodes the decision logic. Prior work reports that the selection of HPs can significantly influence fairness. However, tuning HPs to find an ideal trade-off between accuracy, precision, and fairness remains an expensive and tedious task. Can we predict the fairness of an HP configuration for a given dataset? Are the predictions robust to distribution shifts? We focus on group fairness notions and investigate the HP space of 5 training algorithms. We first find that tree regressors and XGBoost significantly outperform deep neural networks and support vector machines in accurately predicting the fairness of HPs. When predicting the fairness of ML hyperparameters under temporal distribution shift, tree regressors outperform the other algorithms with reasonable accuracy. However, the precision depends on the ML training algorithm, the dataset, and the protected attribute. For example, the tree regressor model was robust to a training data shift from 2014 to 2018 on logistic regression and discriminant analysis HPs with sex as the protected attribute; it was not robust for race or for other training algorithms. Our method provides a sound framework to efficiently fine-tune ML training algorithms and to understand the relationships between HPs and fairness.
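To make "group fairness" concrete, the following is a minimal sketch of how a fairness score for one trained model could be computed; such a score would be the quantity that a regressor is asked to predict from an HP configuration. The two metrics shown (statistical parity difference and equal opportunity difference) are common group fairness notions chosen here for illustration, and the abstract does not state which metrics the paper uses. The function names, the 0/1 encoding of the protected attribute, and the sample data are all assumptions.

```python
from typing import Sequence


def positive_rate(preds: Sequence[int]) -> float:
    """Fraction of predictions equal to 1 in a non-empty list of 0/1 values."""
    if not preds:
        raise ValueError("group has no samples; the rate is undefined")
    return sum(preds) / len(preds)


def statistical_parity_difference(
    y_pred: Sequence[int], protected: Sequence[int]
) -> float:
    """P(y_pred = 1 | protected = 0) - P(y_pred = 1 | protected = 1).

    Assumption: protected == 0 marks the unprivileged group and
    protected == 1 the privileged group. A value of 0 means parity.
    """
    unpriv = [p for p, g in zip(y_pred, protected) if g == 0]
    priv = [p for p, g in zip(y_pred, protected) if g == 1]
    return positive_rate(unpriv) - positive_rate(priv)


def equal_opportunity_difference(
    y_true: Sequence[int], y_pred: Sequence[int], protected: Sequence[int]
) -> float:
    """Difference in true positive rates between the two groups.

    Only samples whose true label is 1 are considered.
    """
    rows = list(zip(y_true, y_pred, protected))
    unpriv = [p for t, p, g in rows if g == 0 and t == 1]
    priv = [p for t, p, g in rows if g == 1 and t == 1]
    return positive_rate(unpriv) - positive_rate(priv)


# Hypothetical predictions from a model trained with one HP configuration.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
protected = [0, 0, 0, 0, 1, 1, 1, 1]

spd = statistical_parity_difference(y_pred, protected)
eod = equal_opportunity_difference(y_true, y_pred, protected)
print(spd, eod)  # -0.5 -0.5
```

Under these assumptions, repeating the computation for many sampled HP configurations would yield (configuration, fairness score) pairs, which is the kind of data a tree regressor or XGBoost model could be trained on.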
