This paper presents a comprehensive analysis of the growth rate of $H$-consistency bounds (and excess error bounds) for various surrogate losses used in classification. We prove a square-root growth rate near zero for smooth margin-based surrogate losses in binary classification, providing both upper and lower bounds under mild assumptions. This result also translates to excess error bounds. Our lower bound requires weaker conditions than those in previous work for excess error bounds, and our upper bound is entirely novel. Moreover, we extend this analysis to multi-class classification with a series of novel results, demonstrating a universal square-root growth rate for smooth comp-sum and constrained losses, which cover common choices for training neural networks in multi-class classification. Given this universal rate, we turn to the question of choosing among different surrogate losses. We first examine how $H$-consistency bounds vary across surrogates as a function of the number of classes. Next, ignoring constants and focusing on behavior near zero, we identify minimizability gaps as the key differentiating factor in these bounds. We therefore analyze these gaps in detail to guide surrogate loss selection, covering comparisons across different comp-sum losses, conditions under which the gaps become zero, and general conditions leading to small gaps. Additionally, we demonstrate the key role of minimizability gaps in comparing excess error bounds and $H$-consistency bounds.