Machine learning models routinely automate decisions in applications like lending and hiring. In such settings, consumer protection rules require companies that deploy models to explain predictions to decision subjects. These rules are motivated, in part, by the belief that explanations can promote recourse by revealing information that individuals can use to contest or improve their outcomes. In practice, many companies comply with these rules by providing individuals with a list of the most important features for their prediction, which they identify based on feature importance scores from feature attribution methods such as SHAP or LIME. In this work, we show how these practices can undermine consumers by highlighting features that would not lead to an improved outcome and by explaining predictions that cannot be changed. We propose to address these issues by highlighting features based on their responsiveness score -- i.e., the probability that an individual can attain a target prediction by changing a specific feature. We develop efficient methods to compute responsiveness scores for any model and any dataset. We conduct an extensive empirical study on the responsiveness of explanations in lending. Our results show that standard practices in consumer finance can backfire by presenting consumers with reasons without recourse, and demonstrate how our approach improves consumer protection by highlighting responsive features and identifying fixed predictions.
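To make the central quantity concrete, the sketch below shows one way a responsiveness score could be estimated by Monte Carlo sampling over the values an individual could feasibly attain for a single feature. The function name `responsiveness_score`, the scikit-learn-style `predict` interface, the binary target encoding, and the caller-supplied `feasible_values` set are all illustrative assumptions, not the paper's implementation.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def responsiveness_score(model, x, feature, feasible_values,
                         target=1, n_samples=1000, seed=0):
    """Estimate the probability that changing `feature` to a value the
    individual could feasibly attain yields the `target` prediction.
    Sketch only: assumes a scikit-learn-style `predict` method, a binary
    target encoding, and a caller-supplied set of feasible values."""
    rng = np.random.default_rng(seed)
    draws = rng.choice(np.asarray(feasible_values), size=n_samples)
    # Copy the individual's feature vector and vary only one feature.
    X = np.tile(np.asarray(x, dtype=float), (n_samples, 1))
    X[:, feature] = draws
    # Fraction of feasible changes that reach the target prediction.
    return float(np.mean(model.predict(X) == target))

# Toy usage: score each feature of a denied applicant.
rng = np.random.default_rng(1)
X_train = rng.normal(size=(500, 3))
y_train = (X_train[:, 0] + 0.1 * rng.normal(size=500) > 0).astype(int)
clf = LogisticRegression().fit(X_train, y_train)

x_denied = np.array([-1.0, 0.5, -0.3])
for j in range(3):
    score = responsiveness_score(clf, x_denied, j,
                                 feasible_values=np.linspace(-2, 2, 41))
    print(f"feature {j}: responsiveness = {score:.2f}")
```

In this toy example only the first feature influences the model, so it alone receives a high responsiveness score; a feature-attribution method could still assign nonzero importance to the other features even though changing them would not alter the outcome, which is the failure mode the abstract describes.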