Image recognition is an essential baseline task for deep metric learning. Hierarchical knowledge about image classes captures inter-class similarities and dissimilarities, yet fusing this knowledge effectively to improve image recognition remains a challenge. In this paper, we propose a novel deep metric learning based method that fuses hierarchical prior knowledge about image classes and improves image recognition performance in an end-to-end supervised regression manner. Existing image classification methods that incorporate deep metric learning mainly exploit qualitative relativity between image classes, i.e., whether sampled images are from the same class. We also propose a new triplet loss function term that exploits quantitative relativity by aligning distances in the model latent space with those in the knowledge space, and we incorporate it into the proposed dual-modality fusion method. Experimental results indicate that the proposed method improves image recognition performance and outperforms baseline and existing methods on the CIFAR-10, CIFAR-100, Mini-ImageNet, and ImageNet-1K datasets.
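The abstract does not give the form of the quantitative triplet term, so the following is only a minimal sketch of one way such a term could look, not the authors' formulation. It assumes a precomputed class-to-class distance matrix `class_dist` (for example, path lengths in a class hierarchy) standing in for the knowledge space, and it adds a squared-error alignment penalty to a standard margin-based triplet loss. The function names, the `margin` and `weight` parameters, and the choice of squared error are all illustrative assumptions.

```python
import numpy as np


def pairwise_dist(x):
    """Euclidean distance between every pair of rows in x, shape (n, n)."""
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def quantitative_triplet_loss(emb, labels, class_dist, margin=0.2, weight=1.0):
    """Hypothetical triplet loss with a knowledge-alignment term.

    emb:        (n, d) latent embeddings
    labels:     (n,) integer class labels
    class_dist: (C, C) knowledge-space distances between classes (assumed given)
    """
    d_lat = pairwise_dist(emb)
    # Knowledge-space distance for every pair of samples, via their classes.
    d_know = class_dist[labels][:, labels]
    same = labels[:, None] == labels[None, :]

    qual, quant, count = 0.0, 0.0, 0
    n = len(labels)
    for a in range(n):
        for p in range(n):
            if p == a or not same[a, p]:
                continue
            for neg in range(n):
                if same[a, neg]:
                    continue
                # Qualitative part: the negative must be farther than the positive.
                qual += max(0.0, d_lat[a, p] - d_lat[a, neg] + margin)
                # Quantitative part (assumed form): the latent anchor-negative
                # distance should match the knowledge-space distance.
                quant += (d_lat[a, neg] - d_know[a, neg]) ** 2
                count += 1
    if count == 0:
        return 0.0
    return (qual + weight * quant) / count


# Toy usage: two classes whose latent distance is 2.
emb = np.array([[0.0, 0.0], [0.0, 0.0], [2.0, 0.0], [2.0, 0.0]])
labels = np.array([0, 0, 1, 1])
class_dist = np.array([[0.0, 2.0], [2.0, 0.0]])
print(quantitative_triplet_loss(emb, labels, class_dist))
```

In this toy case the latent distances already agree with `class_dist`, so both terms vanish; changing the off-diagonal entries of `class_dist` makes the alignment penalty grow with the squared mismatch, which is the kind of quantitative signal a same-class/different-class triplet loss alone cannot express.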