The focal loss has become a widely used alternative to the cross-entropy in class-imbalanced classification problems, particularly in computer vision. Despite its empirical success, a systematic information-theoretic study of the focal loss remains incomplete. In this work, we adopt a distributional viewpoint and study the focal entropy, the focal-loss analogue of the cross-entropy. Our analysis establishes conditions for the finiteness, convexity, and continuity of the focal entropy, and provides several asymptotic characterizations of its behavior. We prove the existence and uniqueness of the focal-entropy minimizer, describe its structure, and show that it can depart significantly from the data distribution. In particular, we rigorously show that the focal loss amplifies mid-range probabilities, suppresses high-probability outcomes, and, under extreme class imbalance, induces an over-suppression regime in which very small probabilities are diminished further. These results, which we also validate experimentally, offer a theoretical foundation for understanding the focal loss and clarify the trade-offs it introduces in imbalanced learning tasks.
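For concreteness, a minimal sketch of the quantities involved: the standard focal loss of Lin et al. (2017) for a predicted probability q of the true class, and the natural focal-loss analogue of the cross-entropy H(p, q) = -\sum_i p_i \log q_i suggested by the abstract. The symbol H_\gamma and this exact form are our illustrative assumptions, not necessarily the paper's verbatim definition.

\[
\mathrm{FL}_\gamma(q) = -(1-q)^{\gamma}\log q,
\qquad
H_\gamma(p, q) = -\sum_{i} p_i\,(1-q_i)^{\gamma}\log q_i,
\qquad \gamma \ge 0.
\]

For \gamma = 0 this reduces to the ordinary cross-entropy, whose minimizer over q on the probability simplex is q = p by Gibbs' inequality; for \gamma > 0 the weight (1-q_i)^{\gamma} alters the first-order optimality conditions, which is why the focal-entropy minimizer can deviate from the data distribution as described above.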