This paper proposes the use of causal modeling to detect and mitigate algorithmic bias that is nonlinear in the protected attribute. We provide a general overview of our approach. We use the German Credit data set, available for download from the UC Irvine Machine Learning Repository, to develop (1) a prediction model, which is treated as a black box, and (2) a causal model for bias mitigation. We focus on age bias and the problem of binary classification. We show that the probability of being correctly classified as "low risk" is lowest among young people and increases nonlinearly with age. To incorporate this nonlinearity into the causal model, we introduce a higher-order polynomial term. Based on the fitted causal model, we compute de-biased probability estimates, which show improved fairness with little impact on overall classification accuracy. Causal modeling is intuitive; hence, its use can enhance explicability and promote trust among the different stakeholders of AI.
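The idea of capturing a nonlinear age effect with a polynomial term and then removing it from the black-box output can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the causal model developed in the paper: it uses synthetic scores rather than the German Credit data, a plain least-squares fit in place of a full causal model, and the names `debias_scores`, `degree`, and `reference_age` are hypothetical.

```python
import numpy as np


def debias_scores(scores, age, degree=2, reference_age=None):
    """Subtract a fitted polynomial age effect from black-box scores.

    Illustrative only: a least-squares stand-in for a causal model,
    with the age effect measured relative to a reference age.
    """
    scores = np.asarray(scores, dtype=float)
    age = np.asarray(age, dtype=float)
    if reference_age is None:
        reference_age = float(age.mean())

    # Standardize age so the polynomial columns stay well conditioned.
    mu, sd = age.mean(), age.std()
    z = (age - mu) / sd
    z_ref = (reference_age - mu) / sd

    # Design matrix with columns 1, z, z^2, ..., z^degree.
    X = np.vander(z, degree + 1, increasing=True)
    coef, *_ = np.linalg.lstsq(X, scores, rcond=None)

    # Age-attributable part of each score, relative to the reference age.
    x_ref = z_ref ** np.arange(degree + 1)
    age_effect = X @ coef - x_ref @ coef

    # Remove the age effect and keep the result a valid probability.
    adjusted = np.clip(scores - age_effect, 0.0, 1.0)
    return adjusted, coef


# Synthetic example (assumed data): scores rise with age and then level off.
rng = np.random.default_rng(0)
age = rng.uniform(19, 75, size=1000)
raw = 0.3 + 0.5 * (1.0 - np.exp(-(age - 19) / 15)) + rng.normal(0, 0.05, 1000)
scores = np.clip(raw, 0.0, 1.0)

adjusted, coef = debias_scores(scores, age, degree=2)
corr_before = np.corrcoef(age, scores)[0, 1]
corr_after = np.corrcoef(age, adjusted)[0, 1]
print(f"correlation with age before: {corr_before:.3f}, after: {corr_after:.3f}")
```

In this sketch the quadratic term plays the role of the higher-order polynomial term: a purely linear adjustment would leave the curvature of the age effect in the adjusted scores. The choice of reference age is a design decision that shifts all adjusted scores by a constant.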