Detection and Evaluation of Bias-Inducing Features in Machine Learning

Cause-to-effect analysis helps decompose the likely causes of a problem, such as an undesirable business situation or unintended harm to individuals. With it, we can identify how problems are inherited, rank their causes to prioritize fixes, and simplify and visualize a complex problem. In the context of machine learning (ML), cause-to-effect analysis can be used to understand why a system behaves in a biased way. For example, we can examine the root causes of bias by checking each feature as a potential cause of bias in the model. To do so, one applies small changes to a given feature, or to a pair of features, in the data, following some guidelines, and observes how the changes affect the decision made by the model (i.e., the model prediction). In this way, cause-to-effect analysis can identify potential bias-inducing features even when these features are not known in advance. This matters because most current methods require sensitive features to be identified beforehand for bias assessment and can therefore miss other relevant bias-inducing features, which is why systematic identification of such features is necessary. Moreover, achieving an equitable outcome often requires taking sensitive features into account in the model decision. It should therefore be up to domain experts to decide, based on their knowledge of the decision context, whether the bias induced by specific features is acceptable. In this study, we propose an approach for systematically identifying all bias-inducing features of a model to support the decision-making of domain experts. We evaluated our technique on four well-known datasets to showcase how our contribution can help spearhead a standard procedure for developing, testing, maintaining, and deploying fair/equitable machine learning systems.
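The perturb-and-observe idea described above can be illustrated with a minimal sketch. It is not the procedure proposed in the study: the abstract does not specify the perturbation guidelines, so the per-feature shifts, the flip-rate score, the toy model `toy_predict`, and the sample rows are all hypothetical choices made for illustration only.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Row = Sequence[float]


def flip_rate(
    predict: Callable[[Row], int],
    rows: List[Row],
    feature: int,
    delta: float,
) -> float:
    """Fraction of rows whose predicted label changes when `feature` is shifted by `delta`."""
    flips = 0
    for row in rows:
        perturbed = list(row)
        perturbed[feature] += delta
        if predict(perturbed) != predict(row):
            flips += 1
    return flips / len(rows)


def rank_features(
    predict: Callable[[Row], int],
    rows: List[Row],
    deltas: Dict[int, float],
) -> List[Tuple[int, float]]:
    """Score every feature listed in `deltas` and sort by flip rate, highest first."""
    scores = [(f, flip_rate(predict, rows, f, d)) for f, d in deltas.items()]
    return sorted(scores, key=lambda s: s[1], reverse=True)


# Hypothetical toy model (assumption): approves (1) when income - 2 * group > 3.
# The third feature is deliberately ignored by the model.
def toy_predict(row: Row) -> int:
    income, group, _unused = row
    return int(income - 2.0 * group > 3.0)


# Hypothetical sample data (assumption): columns are income, group, unused.
rows = [
    [3.2, 0.0, 0.1],
    [3.4, 0.0, 0.9],
    [5.1, 1.0, 0.5],
    [5.3, 1.0, 0.2],
    [9.0, 0.0, 0.7],
    [1.0, 1.0, 0.3],
]

# Per-feature perturbation sizes are assumptions, not values from the study.
ranking = rank_features(toy_predict, rows, deltas={0: -0.25, 1: 1.0, 2: 0.5})
for feature, score in ranking:
    print(f"feature {feature}: flip rate {score:.2f}")
```

In this toy setting the `group` column ranks first and the unused column never changes a prediction. A high flip rate only flags a feature as a candidate for review; as the abstract notes, whether the induced bias is acceptable remains a decision for domain experts.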
