Supervised classification techniques use training samples to learn a classification rule with small expected 0-1 loss (error probability). Conventional methods enable tractable learning and provide out-of-sample generalization by using surrogate losses instead of the 0-1 loss and by considering specific families of rules (hypothesis classes). This paper presents minimax risk classifiers (MRCs) that minimize the worst-case 0-1 loss with respect to uncertainty sets of distributions that can include the underlying distribution, with a tunable confidence. We show that MRCs can provide tight performance guarantees at learning and are strongly universally consistent when using feature mappings given by characteristic kernels. The paper also proposes efficient optimization techniques for MRC learning and shows that the methods presented can provide accurate classification together with tight performance guarantees in practice.
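The minimax idea described above can be sketched in symbols. The notation below is an assumption introduced for illustration and is not given in this abstract: $h(y \mid x)$ denotes the probability with which a (possibly randomized) rule $h$ assigns label $y$ to instance $x$, $\Phi$ is a feature mapping, $\tau$ is an estimate of its expectation obtained from the training samples, and $\lambda \succeq 0$ is a vector that sets the size of the uncertainty set and hence the confidence with which it contains the underlying distribution.

```latex
% Expected 0-1 loss (error probability) of a rule h under distribution p
\ell(h, p) = \mathbb{E}_{(x,y) \sim p}\bigl[\, 1 - h(y \mid x) \,\bigr]

% Assumed form of the uncertainty set: distributions whose feature
% expectations lie within \lambda (componentwise) of the estimate \tau
\mathcal{U} = \bigl\{\, p \;:\; \bigl| \mathbb{E}_{p}[\Phi(x,y)] - \tau \bigr| \preceq \lambda \,\bigr\}

% A minimax risk classifier minimizes the worst-case 0-1 loss over \mathcal{U}
h^{\mathcal{U}} \in \arg\min_{h} \; \max_{p \in \mathcal{U}} \; \ell(h, p)
```

Under this sketch, whenever the underlying distribution lies in $\mathcal{U}$, the optimal minimax value is an upper bound on the error probability of $h^{\mathcal{U}}$, which is the sense in which a performance guarantee is available at learning.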