Traditional classifiers treat all labels as mutually independent, thereby considering all negative classes to be equally incorrect. This approach fails severely in many real-world scenarios, where a known semantic hierarchy defines a partial order of preferences over negative classes. While hierarchy-aware feature representations have shown promise in mitigating this problem, their performance is typically assessed using metrics like MS and AHD. In this paper, we highlight important shortcomings in existing hierarchical evaluation metrics, demonstrating that they are often incapable of measuring true hierarchical performance. Our analysis reveals that existing methods learn sub-optimal hierarchical representations, despite competitive MS and AHD scores. To counter these issues, we introduce Hier-COS, a novel framework for unified hierarchy-aware fine-grained and hierarchical multi-level classification. We show that Hier-COS is theoretically guaranteed to be consistent with the given hierarchy tree. Furthermore, our framework implicitly adapts the learning capacity for different classes based on their position within the hierarchy tree, a vital property absent in existing methods. Finally, to address the limitations of evaluation metrics, we propose HOPS, a ranking-based metric that demonstrably overcomes the deficiencies of current evaluation standards. We benchmark Hier-COS on four challenging datasets, including the deep and imbalanced tieredImageNet-H and iNaturalist-19. Through extensive experiments, we demonstrate that Hier-COS achieves state-of-the-art results across all hierarchical metrics on every dataset, while also surpassing the best prior top-1 accuracy in all but one case. Lastly, we show that Hier-COS can effectively learn to transform the frozen features extracted from a pretrained backbone (ViT) to be hierarchy-aware, yielding substantial benefits for hierarchical classification performance.