Modern machine learning approaches to classification, including AdaBoost, support vector machines, and deep neural networks, utilize surrogate loss techniques to circumvent the computational complexity of minimizing empirical classification risk. These techniques are also useful for causal policy learning problems, since estimation of individualized treatment rules can be cast as a weighted (cost-sensitive) classification problem. Consistency of the surrogate loss approaches studied in Zhang (2004) and Bartlett et al. (2006) crucially relies on the assumption of correct specification, meaning that the specified set of classifiers is rich enough to contain a first-best classifier. This assumption is, however, less credible when the set of classifiers is constrained by interpretability or fairness, leaving the applicability of surrogate loss based algorithms unknown in such second-best scenarios. This paper studies consistency of surrogate loss procedures under a constrained set of classifiers without assuming correct specification. We show that in the setting where the constraint restricts the classifier's prediction set only, hinge losses (i.e., $\ell_1$-support vector machines) are the only surrogate losses that preserve consistency in second-best scenarios. If the constraint additionally restricts the functional form of the classifier, consistency of a surrogate loss approach is not guaranteed even with hinge loss. We therefore characterize conditions for the constrained set of classifiers that can guarantee consistency of hinge risk minimizing classifiers. Exploiting our theoretical results, we develop robust and computationally attractive hinge loss based procedures for a monotone classification problem.
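As a minimal sketch of the surrogate-loss idea, in notation that is assumed here rather than taken from the abstract: let $Y \in \{-1,+1\}$ be the label, $X$ the covariates, $\omega \ge 0$ a cost weight, and $f$ a real-valued score whose sign gives the prediction. The weighted classification risk and its hinge surrogate can then be written as

$$
R(f) = E\big[\omega \, 1\{Y f(X) \le 0\}\big], \qquad R_{\phi}(f) = E\big[\omega \, \max\{0,\, 1 - Y f(X)\}\big].
$$

Because $1\{t \le 0\} \le \max\{0,\, 1 - t\}$ for every real $t$, we have $R(f) \le R_{\phi}(f)$, and $R_{\phi}$ is convex in $f$ while $R$ is not. In this notation, consistency over a constrained class $\mathcal{F}$ would mean, loosely, that minimizing $R_{\phi}$ over $\mathcal{F}$ also minimizes $R$ over $\mathcal{F}$, whether or not $\mathcal{F}$ contains a first-best classifier.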