Counterfactuals and counterfactual reasoning underpin numerous techniques for auditing and understanding artificial intelligence (AI) systems. The traditional paradigm for counterfactual reasoning in this literature is the interventional counterfactual, where hypothetical interventions are imagined and simulated. For this reason, the starting point for causal reasoning about legal protections and demographic data in AI is an imagined intervention on a legally protected characteristic, such as ethnicity, race, gender, disability, or age. We ask, for example, what would have happened had your race been different? An inherent limitation of this paradigm is that some demographic interventions, such as interventions on race, may not translate into the formalisms of interventional counterfactuals. In this work, we explore a new paradigm based instead on the backtracking counterfactual: rather than imagining hypothetical interventions on legally protected characteristics, we imagine alternate initial conditions while holding these characteristics fixed. We ask instead, what would explain a counterfactual outcome for you as you actually are or could be? This alternative framework allows us to address many of the same social concerns while asking fundamentally different questions that do not rely on demographic interventions.