In contrast to the standard learning paradigm, where all classes are observed in the training data, learning with augmented classes (LAC) addresses the setting in which augmented classes unobserved during training may emerge in the test phase. Previous research showed that, given unlabeled data, an unbiased risk estimator (URE) can be derived and minimized for LAC with theoretical guarantees. However, this URE is restricted to a specific type of one-versus-rest loss function for multi-class classification, which makes it insufficiently flexible when the loss needs to change with the dataset in practice. In this paper, we propose a generalized URE for LAC that, given unlabeled data, can be equipped with arbitrary loss functions while maintaining the theoretical guarantees. To alleviate the issue of negative empirical risk commonly encountered in previous studies, we further propose a novel risk-penalty regularization term. Experiments demonstrate the effectiveness of our proposed method.