Multi-label zero-shot learning strives to classify images into multiple unseen categories for which no data is available during training. In the generalized variant, the test samples can additionally contain seen categories. Existing approaches rely on learning either shared or label-specific attention from the seen classes. Nevertheless, computing reliable attention maps for unseen classes during inference in a multi-label setting remains a challenge. In contrast, state-of-the-art single-label approaches based on generative adversarial networks (GANs) learn to directly synthesize class-specific visual features from the corresponding class attribute embeddings. However, synthesizing multi-label features with GANs is still unexplored in the zero-shot setting. In this work, we introduce different fusion approaches at the attribute level, the feature level and the cross level (across the attribute and feature levels) for synthesizing multi-label features from their corresponding multi-label class embeddings. To the best of our knowledge, our work is the first to tackle the problem of multi-label feature synthesis in the (generalized) zero-shot setting. Comprehensive experiments are performed on three zero-shot image classification benchmarks: NUS-WIDE, Open Images and MS COCO. Our cross-level fusion-based generative approach outperforms the state-of-the-art on all three datasets. Furthermore, we show the generalization capabilities of our fusion approach on the zero-shot detection task on MS COCO, achieving favorable performance against existing methods. The source code is available at https://github.com/akshitac8/Generative_MLZSL.
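To make the three fusion levels concrete, the following is a minimal illustrative sketch and not the implementation from the linked repository. It assumes that fusion is a plain average, that the generator is a fixed random linear layer with a tanh (a stand-in for a trained conditional GAN generator), and that cross-level fusion is simply the average of the other two outputs. All names (`toy_generator`, `attribute_level_fusion`, `feature_level_fusion`, `cross_level_fusion`) and all dimensions are hypothetical.

```python
import math
import random

EMB_DIM = 4    # size of one class attribute embedding (illustrative)
NOISE_DIM = 2  # size of the generator noise vector (illustrative)
FEAT_DIM = 3   # size of one synthesized visual feature (illustrative)

# Stand-in for a trained conditional generator: one fixed random linear layer.
_rng = random.Random(0)
WEIGHTS = [
    [_rng.uniform(-1.0, 1.0) for _ in range(EMB_DIM + NOISE_DIM)]
    for _ in range(FEAT_DIM)
]


def toy_generator(embedding, noise):
    """Map (class embedding, noise) to a feature vector with entries in [-1, 1]."""
    x = list(embedding) + list(noise)
    return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in WEIGHTS]


def mean_vectors(vectors):
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def attribute_level_fusion(label_embeddings, noise):
    """Fuse the class embeddings first, then synthesize one feature."""
    fused_embedding = mean_vectors(label_embeddings)
    return toy_generator(fused_embedding, noise)


def feature_level_fusion(label_embeddings, noise):
    """Synthesize one feature per class, then fuse the features."""
    per_class = [toy_generator(e, noise) for e in label_embeddings]
    return mean_vectors(per_class)


def cross_level_fusion(label_embeddings, noise):
    """Combine both routes; averaging is an assumption made for this sketch."""
    attribute_feature = attribute_level_fusion(label_embeddings, noise)
    feature_feature = feature_level_fusion(label_embeddings, noise)
    return mean_vectors([attribute_feature, feature_feature])


if __name__ == "__main__":
    # An image labelled with two classes, each given a toy one-hot embedding.
    embeddings = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
    noise = [0.1, -0.2]
    print("attribute-level:", attribute_level_fusion(embeddings, noise))
    print("feature-level:  ", feature_level_fusion(embeddings, noise))
    print("cross-level:    ", cross_level_fusion(embeddings, noise))
```

In this sketch the attribute-level route calls the generator once on a fused embedding, while the feature-level route calls it once per label and fuses afterwards. With a single label the three functions coincide, which matches the single-label generative setting the abstract contrasts against.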