In many real-world prediction tasks, class labels contain information about the relative order between labels that is not captured by commonly used loss functions such as multicategory cross-entropy. Recently, the preference for unimodal distributions in the output space has been incorporated into models and loss functions to account for such ordering information. However, current approaches rely on heuristics that lack a theoretical foundation. Here, we propose two new approaches to incorporate the preference for unimodal distributions into the predictive model. We analyse the set of unimodal distributions in the probability simplex and establish its fundamental properties. We then propose a new architecture that imposes unimodal distributions and a new loss term that relies on the notion of projection onto a set to promote unimodality. Experiments show that the new architecture achieves top-2 performance, and that the proposed loss term is very competitive while maintaining high unimodality.
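To make the notion of unimodality concrete, the following is a minimal sketch that assumes the usual definition for ordered classes: a probability vector is unimodal if it is non-decreasing up to some class and non-increasing afterwards. The function name `is_unimodal` and the tolerance parameter `tol` are illustrative choices, not taken from the paper, and the sketch shows neither the proposed architecture nor the proposed loss term.

```python
from typing import Sequence


def is_unimodal(p: Sequence[float], tol: float = 1e-12) -> bool:
    """Return True if p weakly rises to a single peak and then weakly falls.

    Assumption: this is the standard definition of unimodality over
    ordered classes; the paper's formal definition may differ in details
    such as how ties are treated.
    """
    falling = False  # becomes True once the sequence has started to decrease
    for prev, cur in zip(p, p[1:]):
        if cur < prev - tol:
            falling = True
        elif cur > prev + tol and falling:
            # A rise after a fall means there is a second peak.
            return False
    return True


# Example predictions over four ordered classes.
single_peak = [0.1, 0.3, 0.4, 0.2]   # one peak at the third class
two_peaks = [0.4, 0.1, 0.3, 0.2]     # peaks at the first and third classes
print(is_unimodal(single_peak), is_unimodal(two_peaks))
```

Under this definition, the uniform distribution and monotone distributions (peak at the first or last class) also count as unimodal.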