We propose a scheme for supervised image classification that uses privileged information, in the form of keypoint annotations for the training data, to learn strong models from small and/or biased training sets. Our main motivation is the recognition of animal species for ecological applications such as biodiversity modelling, which is challenging because of long-tailed species distributions due to rare species, and strong dataset biases such as repetitive scene background in camera traps. To counteract these challenges, we propose a visual attention mechanism that is supervised via keypoint annotations that highlight important object parts. This privileged information, implemented as a novel privileged pooling operation, is only required during training and helps the model to focus on regions that are discriminative. In experiments with three different animal species datasets, we show that deep networks with privileged pooling can use small training sets more efficiently and generalize better.
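To make the idea of keypoint-supervised attention concrete, the following is a minimal NumPy sketch, not the privileged pooling operation proposed here. It assumes a single spatial attention map, Gaussian heatmaps rendered from keypoints, and a cross-entropy penalty between the two; all function names (`keypoint_heatmap`, `spatial_softmax`, `attention_pool`, `attention_supervision_loss`) and parameters such as `sigma` are hypothetical choices made for illustration.

```python
import numpy as np


def keypoint_heatmap(keypoints, height, width, sigma=1.0):
    """Render (row, col) keypoints as one Gaussian heatmap that sums to 1."""
    rows = np.arange(height)[:, None]
    cols = np.arange(width)[None, :]
    heat = np.zeros((height, width))
    for r, c in keypoints:
        heat += np.exp(-((rows - r) ** 2 + (cols - c) ** 2) / (2 * sigma ** 2))
    return heat / heat.sum()


def spatial_softmax(logits):
    """Turn an (H, W) logit map into an attention map that sums to 1."""
    z = np.exp(logits - logits.max())
    return z / z.sum()


def attention_pool(features, attention):
    """Attention-weighted average of (C, H, W) features, giving a (C,) vector."""
    return (features * attention[None]).sum(axis=(1, 2))


def attention_supervision_loss(attention, target, eps=1e-12):
    """Cross-entropy between the keypoint target and the predicted attention."""
    return float(-(target * np.log(attention + eps)).sum())


# Toy stand-ins for a backbone feature map and an attention head (assumed shapes).
rng = np.random.default_rng(0)
features = rng.normal(size=(8, 6, 6))
attention = spatial_softmax(rng.normal(size=(6, 6)))

# Pooling uses only the predicted attention, so it also works at test time.
pooled = attention_pool(features, attention)

# The keypoint target enters only through the training loss.
target = keypoint_heatmap([(2, 3)], height=6, width=6)
loss = attention_supervision_loss(attention, target)
```

In this sketch the keypoints appear only in `loss`, which would be added to the classification loss during training; inference needs `features` and `attention` alone, mirroring the statement that the privileged information is required only at training time.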