Recent work has revealed an essential paradigm in the design of loss functions: the distinction between individual losses and aggregate losses. An individual loss measures the quality of the model on a single sample, while an aggregate loss combines the individual losses or scores over all training samples. Both share a common procedure that aggregates a set of individual values into a single numerical value. In designing such losses, the ranking order reflects the most fundamental relation among the individual values. In addition, decomposability, the property that a loss can be decomposed into an ensemble of individual terms, is a significant property for organizing losses and scores. This survey provides a systematic and comprehensive review of rank-based decomposable losses in machine learning. Specifically, we provide a new taxonomy of loss functions that follows the perspectives of aggregate loss and individual loss. We identify the aggregators used to form such losses, which are examples of set functions. We organize the rank-based decomposable losses into eight categories. Following these categories, we review the literature on rank-based aggregate losses and rank-based individual losses. We describe general formulas for these losses and connect them with existing research topics. We also suggest future research directions spanning unexplored, remaining, and emerging issues in rank-based decomposable losses.
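To make the idea of a rank-based aggregator concrete, the following is a minimal Python sketch, not the survey's own formulation. It assumes one common form of such an aggregator: the individual values are sorted in descending order and combined as a weighted sum, where each weight depends only on the rank position. The function names (`rank_weighted_aggregate`, `average_weights`, `maximum_weights`, `top_k_weights`) and the sample losses are hypothetical and chosen purely for illustration.

```python
import numpy as np


def rank_weighted_aggregate(values, weights):
    """Aggregate individual values into one number using rank-dependent weights.

    Assumed form (illustrative): sort the values in descending order, then
    take the dot product with a weight vector indexed by rank position.
    """
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if values.shape != weights.shape:
        raise ValueError("values and weights must have the same shape")
    ranked = np.sort(values)[::-1]  # largest value first
    return float(np.dot(weights, ranked))


def average_weights(n):
    """Equal weight on every rank: recovers the plain average."""
    return np.full(n, 1.0 / n)


def maximum_weights(n):
    """All weight on the top-ranked value: recovers the maximum."""
    w = np.zeros(n)
    w[0] = 1.0
    return w


def top_k_weights(n, k):
    """Equal weight on the k largest values: the average of the top k."""
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    w = np.zeros(n)
    w[:k] = 1.0 / k
    return w


# Hypothetical individual losses for four training samples.
losses = [0.2, 1.5, 0.7, 3.0]
n = len(losses)

average_loss = rank_weighted_aggregate(losses, average_weights(n))
maximum_loss = rank_weighted_aggregate(losses, maximum_weights(n))
top2_loss = rank_weighted_aggregate(losses, top_k_weights(n, 2))
```

Under this assumed form, the average, the maximum, and the average of the top k values differ only in the weight vector applied to the sorted values. The result is decomposable in the sense described above: it is a sum of individual terms, one per rank position.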