Label hierarchy is an important source of external knowledge that can enhance classification performance. However, most existing methods rely on predefined label hierarchies that may not match the data distribution. To address this issue, we propose Simultaneous label hierarchy Exploration And Learning (SEAL), a new framework that explores the label hierarchy by augmenting the observed labels with latent labels that follow a prior hierarchical structure. Our approach uses a 1-Wasserstein metric over the tree metric space as an objective function, which enables us to simultaneously learn a data-driven label hierarchy and perform (semi-)supervised learning. We evaluate our method on several datasets and show that it achieves superior results in both supervised and semi-supervised scenarios and reveals insightful label structures. Our implementation is available at https://github.com/tzq1999/SEAL.
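To make the objective concrete, the following is a minimal sketch of the standard closed form of the 1-Wasserstein distance on a tree metric: the sum over edges of the edge length times the absolute difference in probability mass carried by the subtree below that edge. It is not taken from the SEAL implementation. The function name `tree_wasserstein`, the parent-array encoding, and the toy tree are assumptions made purely for illustration.

```python
def tree_wasserstein(parent, weight, mu, nu):
    """1-Wasserstein distance between two distributions on a rooted tree.

    Illustrative assumptions (not from the paper):
      - nodes are numbered 0..n-1, node 0 is the root, and parent[v] < v;
      - weight[v] is the length of the edge between v and parent[v]
        (weight[0] is unused);
      - mu and nu are lists of node masses that each sum to 1.
    """
    n = len(parent)
    # diff[v] accumulates the net mass (mu - nu) in the subtree rooted at v.
    diff = [mu[v] - nu[v] for v in range(n)]
    total = 0.0
    # Visit children before parents, which parent[v] < v guarantees.
    for v in range(n - 1, 0, -1):
        # All surplus mass in the subtree of v must cross the edge above v.
        total += weight[v] * abs(diff[v])
        diff[parent[v]] += diff[v]
    return total


# Toy hierarchy: root 0, latent nodes 1 and 2, observed leaves 3, 4 (under 1)
# and 5 (under 2). All edges have unit length.
parent = [-1, 0, 0, 1, 1, 2]
weight = [0.0, 1.0, 1.0, 1.0, 1.0, 1.0]

leaf3 = [0, 0, 0, 1, 0, 0]
leaf4 = [0, 0, 0, 0, 1, 0]
leaf5 = [0, 0, 0, 0, 0, 1]

same_branch = tree_wasserstein(parent, weight, leaf3, leaf4)   # siblings
cross_branch = tree_wasserstein(parent, weight, leaf3, leaf5)  # different branches
```

In this toy tree, confusing two sibling leaves costs less than confusing leaves in different branches, which is the kind of hierarchy-aware penalty a tree-metric objective can express.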