Multi-Label Learning with Stronger Consistency Guarantees

We present a detailed study of surrogate losses and algorithms for multi-label learning, supported by $H$-consistency bounds. We first show that, for the simplest form of multi-label loss (the popular Hamming loss), the well-known consistent binary relevance surrogate suffers from a sub-optimal dependency on the number of labels in terms of $H$-consistency bounds when used with smooth losses such as the logistic loss. Furthermore, this loss function fails to account for label correlations. To address these drawbacks, we introduce a novel surrogate loss, the multi-label logistic loss, that accounts for label correlations and benefits from label-independent $H$-consistency bounds. We then broaden our analysis to cover a more extensive family of multi-label losses, including all common ones and a new extension defined in terms of linear-fractional functions of the confusion matrix. We also extend our multi-label logistic losses to more comprehensive multi-label comp-sum losses, adapting comp-sum losses from standard classification to multi-label learning. We prove that this family of surrogate losses benefits from $H$-consistency bounds, and thus Bayes-consistency, for any general multi-label loss. Our work thus proposes a unified surrogate loss framework benefiting from strong consistency guarantees for any multi-label loss, significantly expanding upon previous work, which established only Bayes-consistency, and only for specific loss functions. Additionally, we adapt constrained losses from standard classification to multi-label constrained losses in a similar way; these also benefit from $H$-consistency bounds, and thus Bayes-consistency, for any multi-label loss. We further describe efficient gradient computation algorithms for minimizing the multi-label logistic loss.
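For readers less familiar with the baseline discussed above, the following is a minimal sketch of the Hamming loss and of the binary relevance surrogate with the logistic loss, written in their standard textbook forms. It is an illustration only, not the paper's proposed multi-label logistic loss. The function names, the label encoding in $\{-1, +1\}$, and the convention that a score of exactly zero predicts $+1$ are assumptions made for this example.

```python
import math

def hamming_loss(scores, labels):
    """Number of labels whose predicted sign differs from the true label.

    Assumption: labels are in {-1, +1} and a score of 0 predicts +1.
    """
    return sum(1 for s, y in zip(scores, labels) if (1 if s >= 0 else -1) != y)

def binary_relevance_logistic(scores, labels):
    """Binary relevance surrogate: a sum of independent per-label logistic
    losses log(1 + exp(-y_i * s_i)), one term per label."""
    return sum(math.log1p(math.exp(-y * s)) for s, y in zip(scores, labels))

# Hypothetical scores h(x, i) for three labels and their true values.
scores = [2.0, -1.0, 0.5]
labels = [1, 1, -1]

print(hamming_loss(scores, labels))               # labels 2 and 3 are wrong
print(binary_relevance_logistic(scores, labels))  # smooth surrogate value
```

Each label contributes its own term, so the surrogate treats labels independently; this is the property the abstract refers to when it notes that binary relevance does not account for label correlations.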
