Machine learning models are often used to automate or support decisions in applications such as lending and hiring. In such settings, consumer protection rules mandate that we provide a list of "principal reasons" to consumers who receive adverse decisions. In practice, lenders and employers identify principal reasons by returning the top-scoring features from a feature attribution method. In this work, we study how such practices align with one of the underlying goals of consumer protection - recourse - i.e., educating individuals on how they can attain a desired outcome. We show that standard attribution methods can mislead individuals by highlighting reasons without recourse - i.e., by presenting consumers with features that cannot be changed to achieve recourse. We propose to address these issues by scoring features on the basis of responsiveness - i.e., the probability that an individual can attain a desired outcome by changing a specific feature. We develop efficient methods to compute responsiveness scores for any model and any dataset under complex actionability constraints. We present an extensive empirical study on the responsiveness of explanations in lending and demonstrate how responsiveness scores can be used to construct feature-highlighting explanations that lead to recourse and mitigate harm by flagging instances with fixed predictions.
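To make the notion of a responsiveness score concrete, here is a minimal sketch, not the paper's actual method: it estimates responsiveness as the fraction of feasible single-feature changes that flip a model's prediction to the desired outcome. All names (`responsiveness`, `model_predict`, `feasible_values`) and the toy lending model are illustrative assumptions.

```python
import numpy as np

def responsiveness(model_predict, x, feature_idx, feasible_values, desired=1):
    """Illustrative estimate of one feature's responsiveness for instance x:
    the fraction of feasible values for that feature which, when substituted,
    yield the desired prediction. A real implementation would also handle
    joint actionability constraints across features."""
    if not feasible_values:
        return 0.0
    hits = 0
    for v in feasible_values:
        x_new = np.array(x, dtype=float)
        x_new[feature_idx] = v  # single-feature intervention
        if model_predict(x_new) == desired:
            hits += 1
    return hits / len(feasible_values)

# Toy lending model (assumed for illustration): approve (1) iff income + 0.5*savings >= 10
predict = lambda x: int(x[0] + 0.5 * x[1] >= 10)

x = [6.0, 2.0]                # denied applicant: features are (income, savings)
income_levels = [4, 6, 8, 10, 12]  # feasible income values for this individual
score = responsiveness(predict, x, 0, income_levels)  # -> 0.4 (2 of 5 values flip the decision)
```

A feature with a score of 0 under this estimate is a "reason without recourse" in the abstract's sense: no feasible change to it alone attains the desired outcome, so highlighting it in an adverse-action notice would mislead the consumer.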