In the dominant paradigm for designing equitable machine learning systems, one works to ensure that model predictions satisfy various fairness criteria, such as parity in error rates across race, gender, and other legally protected traits. That approach, however, typically ignores the downstream decisions and outcomes that predictions affect, and, as a result, can induce unexpected harms. Here we present an alternative framework for fairness that directly anticipates the consequences of decisions. Stakeholders first specify preferences over the possible outcomes of an algorithmically informed decision-making process. For example, lenders may prefer extending credit to those most likely to repay a loan, while also preferring similar lending rates across neighborhoods. One then searches the space of decision policies to maximize the specified utility. We develop and describe a method for efficiently learning these optimal policies from data for a large family of expressive utility functions, facilitating a more holistic approach to equitable decision-making.
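To make the lending example concrete, the following is a minimal sketch of the general idea: write down a utility that rewards expected repayment and penalizes gaps in lending rates across neighborhoods, then search a small policy space for the maximizer. It is not the learning method the paper develops. The payoff values, the `parity_weight` parameter, the restriction to group-specific thresholds, the brute-force grid search, and the synthetic data are all assumptions made for illustration.

```python
import itertools

import numpy as np


def lending_gap(decisions, group):
    """Absolute difference in lending rates between the two neighborhoods."""
    return abs(decisions[group == 0].mean() - decisions[group == 1].mean())


def utility(decisions, repay_prob, group, parity_weight):
    """Hypothetical utility: average expected payoff minus a parity penalty.

    Assumed payoffs: +1 for a repaid loan, -1 for a default, 0 for no loan.
    """
    expected_payoff = np.mean(decisions * (2.0 * repay_prob - 1.0))
    return expected_payoff - parity_weight * lending_gap(decisions, group)


def best_threshold_policy(repay_prob, group, parity_weight, grid=None):
    """Grid-search group-specific thresholds; returns (utility, t0, t1)."""
    grid = np.linspace(0.0, 1.0, 21) if grid is None else grid
    best = None
    for t0, t1 in itertools.product(grid, repeat=2):
        thresholds = np.where(group == 0, t0, t1)
        decisions = (repay_prob >= thresholds).astype(float)
        u = utility(decisions, repay_prob, group, parity_weight)
        if best is None or u > best[0]:
            best = (u, t0, t1)
    return best


# Synthetic applicants: two neighborhoods with different repayment profiles.
rng = np.random.default_rng(0)
group = rng.integers(0, 2, size=1000)
repay_prob = np.clip(rng.normal(0.6 - 0.15 * group, 0.2), 0.0, 1.0)

for w in (0.0, 0.5):
    u, t0, t1 = best_threshold_policy(repay_prob, group, w)
    print(f"parity_weight={w}: utility={u:.3f}, thresholds=({t0:.2f}, {t1:.2f})")
```

Setting `parity_weight` to zero recovers a policy driven purely by repayment probability, while larger values trade some expected payoff for more similar lending rates. The weight itself encodes a stakeholder preference and is not something the data can supply.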