Supervised learning systems are trained on historical data and, if that data was tainted by discrimination, they may unintentionally learn to discriminate against protected groups. We propose that fair learning methods, despite training on potentially discriminatory datasets, should perform well on fair test datasets. Such dataset shifts make explicit the application scenarios for specific fair learning methods. For instance, the removal of direct discrimination can be represented as a particular dataset shift problem. For this scenario, we propose a learning method that provably minimizes model error on fair datasets while blindly training on datasets poisoned with direct additive discrimination. The method is compatible with existing legal systems and addresses the widely discussed issue of protected groups' intersectionality by striking a balance between the protected groups. Technically, the method applies probabilistic interventions, has causal and counterfactual formulations, and is computationally lightweight: it can be used with any supervised learning model to prevent discrimination via proxies while maximizing model accuracy for business necessity.
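The general idea of a probabilistic intervention on the protected attribute can be illustrated with a small synthetic sketch. This is not the method proposed here, whose details the abstract does not give. It only assumes a linear model, a binary protected attribute `a`, a proxy feature `x`, and a direct additive penalty of 1.5 on the label. All names (`fit_linear`, `predict_interventional`) and all numeric values are hypothetical, and mixing over the marginal distribution of `a` is just one possible choice of balance between groups.

```python
import numpy as np


def fit_linear(features, target):
    """Ordinary least squares with an intercept; returns [intercept, coefficients...]."""
    design = np.column_stack([np.ones(len(features)), features])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef


def predict_linear(coef, features):
    """Apply a coefficient vector produced by fit_linear."""
    design = np.column_stack([np.ones(len(features)), features])
    return design @ coef


def predict_interventional(coef, x, a_values, a_probs):
    """Average predictions over a fixed distribution of the protected attribute.

    Every individual is scored as if `a` were drawn from the same mixture,
    so the prediction depends on x only (assumed illustration, not the paper's estimator).
    """
    preds = np.zeros(len(x))
    for a_val, prob in zip(a_values, a_probs):
        a_fixed = np.full(len(x), a_val)
        preds += prob * predict_linear(coef, np.column_stack([x, a_fixed]))
    return preds


# Synthetic data (all values are assumptions chosen for illustration).
rng = np.random.default_rng(0)
n = 20_000
a = rng.binomial(1, 0.3, size=n)               # protected attribute
x = rng.normal(size=n) + 1.0 * a               # legitimate feature, correlated with a (a proxy)
y_fair = 2.0 * x + rng.normal(scale=0.1, size=n)
y_biased = y_fair - 1.5 * a                    # direct additive discrimination in training labels

# Model that sees the protected attribute during training.
coef_full = fit_linear(np.column_stack([x, a]), y_biased)

# Model that simply drops the protected attribute; x can then act as a proxy for a.
coef_proxy = fit_linear(x.reshape(-1, 1), y_biased)

# Intervene at prediction time: mix over the marginal distribution of a.
p1 = a.mean()
pred_interventional = predict_interventional(coef_full, x, [0, 1], [1 - p1, p1])

print("slope on x, model trained with a:", coef_full[1])
print("slope on x, model trained without a:", coef_proxy[1])
```

In this toy setting, the model trained without `a` absorbs part of the penalty into the coefficient of the correlated feature `x`, which is discrimination via a proxy. Training with `a` and then averaging it out at prediction time keeps the coefficient of `x` close to the one that generated the fair labels. The averaged predictions still carry a constant offset that depends on the chosen mixing weights.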