Machine learning models are increasingly used for decision-making, in particular in high-stakes applications such as credit scoring, medicine, and recidivism prediction. However, there are growing concerns about their lack of interpretability and about the undesirable biases they can generate or reproduce. While interpretability and fairness have been studied extensively in recent years, few works have tackled the general multi-class classification problem under fairness constraints, and none proposes to generate models that are both fair and interpretable for multi-class classification. In this paper, we use Mixed-Integer Linear Programming (MILP) techniques to produce inherently interpretable scoring systems under sparsity and fairness constraints for the general multi-class classification setting. Our work generalizes the SLIM (Supersparse Linear Integer Models) framework proposed by Rudin and Ustun to learn optimal scoring systems for binary classification. MILP techniques make it easy to integrate diverse operational constraints (such as, but not restricted to, fairness or sparsity), and they also allow building certifiably optimal models (or sub-optimal models with a bounded optimality gap).
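To make the notion of a multi-class scoring system more concrete, the following minimal sketch shows how such a model could be evaluated once it has been learned. It is not the MILP formulation of the paper: the integer weights, class names, feature layout, and the statistical-parity measure below are all hypothetical choices made for illustration. The sketch only computes the quantities (prediction by highest score, number of features used, and a between-group rate gap) that sparsity and fairness constraints would typically bound.

```python
from typing import Dict, List, Sequence

# Hypothetical score tables: one small-integer weight vector per class.
# A zero weight means the feature is not used for that class.
WEIGHTS: Dict[str, List[int]] = {
    "low":    [2, 0, -1, 0],
    "medium": [0, 1, 0, 0],
    "high":   [-1, 0, 3, 0],
}
INTERCEPTS: Dict[str, int] = {"low": 1, "medium": 0, "high": -2}


def class_scores(x: Sequence[int]) -> Dict[str, int]:
    """Integer score of each class: intercept plus weighted sum of features."""
    return {
        c: INTERCEPTS[c] + sum(w * v for w, v in zip(WEIGHTS[c], x))
        for c in WEIGHTS
    }


def predict(x: Sequence[int]) -> str:
    """Predict the class with the highest score (ties go to the first class listed)."""
    scores = class_scores(x)
    return max(scores, key=scores.get)


def model_size() -> int:
    """Number of features with a non-zero weight for at least one class."""
    n_features = len(next(iter(WEIGHTS.values())))
    return sum(
        1 for j in range(n_features) if any(WEIGHTS[c][j] != 0 for c in WEIGHTS)
    )


def parity_gap(samples: Sequence[Sequence[int]], groups: Sequence[int], label: str) -> float:
    """Absolute gap between groups 0 and 1 in the rate of predicting `label`.

    Assumes both groups are non-empty. This is one illustrative fairness
    measure (statistical parity for a single class), chosen as an assumption.
    """
    rates = []
    for g in (0, 1):
        members = [x for x, s in zip(samples, groups) if s == g]
        rates.append(sum(predict(x) == label for x in members) / len(members))
    return abs(rates[0] - rates[1])
```

In an optimization setting, quantities like `model_size()` and `parity_gap(...)` would appear as constraints or penalty terms over the integer weights rather than being computed after the fact, which is what allows a solver to certify optimality or report a bounded gap.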