Meta-learning is an effective approach to imbalanced and noisy-label learning, but it depends on a validation set of randomly selected, manually labelled, and class-balanced samples. Randomly selecting, manually labelling, and balancing this validation set is not only sub-optimal for meta-learning, but also scales poorly with the number of classes. Hence, recent meta-learning papers have proposed ad-hoc heuristics to automatically build and label the validation set, but these heuristics remain sub-optimal for meta-learning. In this paper, we analyse the meta-learning algorithm and propose new criteria to characterise the utility of the validation set, based on: 1) the informativeness of the validation set; 2) its class-distribution balance; and 3) the correctness of its labels. Furthermore, we propose a new imbalanced noisy-label meta-learning (INOLML) algorithm that automatically builds a validation set by maximising its utility according to these criteria. Our method shows significant improvements over previous meta-learning approaches and sets a new state of the art on several benchmarks.
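To make the three utility criteria concrete, the sketch below scores a candidate validation set with one illustrative term per criterion: predictive entropy for informativeness, negative KL divergence to the uniform label distribution for balance, and model confidence in the assigned label as a proxy for label correctness. The exact forms, weighting, and the `utility_score` function are our own illustrative assumptions, not the INOLML formulation.

```python
from collections import Counter
import math

def utility_score(samples, num_classes):
    """Toy utility score for a candidate validation set.

    Each sample is (probs, label): `probs` is the model's predicted class
    distribution and `label` is the (possibly noisy) assigned label. The
    three terms loosely mirror the criteria in the abstract; this is a
    hypothetical sketch, not the paper's actual objective.
    """
    n = len(samples)

    # 1) Informativeness: mean predictive entropy of the selected samples.
    informativeness = sum(
        -sum(p * math.log(p) for p in probs if p > 0) for probs, _ in samples
    ) / n

    # 2) Balance: negative KL divergence from the empirical label
    #    distribution to the uniform distribution (0 when perfectly balanced).
    counts = Counter(label for _, label in samples)
    balance = -sum(
        (counts[c] / n) * math.log((counts[c] / n) * num_classes)
        for c in counts
    )

    # 3) Label correctness: mean model confidence in the assigned label,
    #    a cheap proxy for the label being clean.
    correctness = sum(probs[label] for probs, label in samples) / n

    return informativeness + balance + correctness
```

Under these illustrative terms, a class-balanced candidate set with confidently, correctly labelled samples scores higher than a skewed one, which is the behaviour a utility-maximising selection procedure would exploit.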