Zero-shot learning (ZSL) aims to recognize unseen classes by generalizing the relation between visual features and semantic attributes learned from seen classes. A recent paradigm, transductive zero-shot learning, additionally leverages unlabeled unseen data during training and has obtained impressive results. These methods typically synthesize unseen features from attributes with a generative adversarial network to mitigate the bias towards seen classes. However, they neglect the semantic information in the unlabeled unseen data and thus fail to generate high-fidelity, attribute-consistent unseen features. To address this issue, we present a novel transductive ZSL method that produces semantic attributes for the unseen data and imposes them on the generative process. In particular, we first train an attribute decoder that learns the mapping from visual features to semantic attributes. We then use this decoder to obtain pseudo-attributes of the unlabeled data and integrate them into the generative model, which helps capture the detailed differences within unseen classes and thereby synthesize more discriminative features. Experiments on five standard benchmarks show that our method yields state-of-the-art results for zero-shot learning.
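The two-stage idea described in the abstract (fit an attribute decoder on seen data, then decode pseudo-attributes for unlabeled unseen features and feed them to a generator) can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration and not the paper's implementation: the decoder is replaced by a closed-form ridge regression, the data are synthetic, the generator itself is omitted, and conditioning is shown only as concatenating noise with the pseudo-attributes, which is one common way to condition a generative model. The function names `fit_attribute_decoder`, `pseudo_attributes`, `generator_input`, and `make_toy_data` are hypothetical.

```python
import numpy as np


def fit_attribute_decoder(features, attributes, ridge=1e-2):
    """Fit a linear feature-to-attribute map by ridge regression.

    Assumption: a linear stand-in for the attribute decoder; the abstract
    does not specify the decoder architecture.
    """
    dim = features.shape[1]
    gram = features.T @ features + ridge * np.eye(dim)
    return np.linalg.solve(gram, features.T @ attributes)  # (dim, n_attr)


def pseudo_attributes(decoder, features):
    """Decode per-sample pseudo-attributes from (unlabeled) visual features."""
    return features @ decoder


def generator_input(attrs, noise_dim, rng):
    """Build a conditional generator input: noise concatenated with attributes."""
    noise = rng.standard_normal((attrs.shape[0], noise_dim))
    return np.concatenate([noise, attrs], axis=1)


def make_toy_data(seed=0, n_attr=6, feat_dim=32, n_seen=12, n_unseen=4, per_class=50):
    """Synthetic data: features are a noisy linear function of class attributes."""
    rng = np.random.default_rng(seed)
    class_attrs = rng.random((n_seen + n_unseen, n_attr))
    mixing = rng.standard_normal((n_attr, feat_dim))
    labels = np.repeat(np.arange(n_seen + n_unseen), per_class)
    noise = 0.05 * rng.standard_normal((labels.size, feat_dim))
    feats = class_attrs[labels] @ mixing + noise
    seen_mask = labels < n_seen
    return feats, labels, class_attrs, seen_mask


feats, labels, class_attrs, seen_mask = make_toy_data()

# Stage 1: train the decoder on seen features and their class attributes.
decoder = fit_attribute_decoder(feats[seen_mask], class_attrs[labels[seen_mask]])

# Stage 2: decode pseudo-attributes for the unseen features (labels unused).
pseudo = pseudo_attributes(decoder, feats[~seen_mask])

# Condition a (not implemented) generator on the per-sample pseudo-attributes.
gen_in = generator_input(pseudo, noise_dim=16, rng=np.random.default_rng(1))

# Sanity check: match each pseudo-attribute to the nearest unseen class attribute.
unseen_ids = np.unique(labels[~seen_mask])
dists = ((pseudo[:, None, :] - class_attrs[unseen_ids][None]) ** 2).sum(axis=-1)
accuracy = (unseen_ids[dists.argmin(axis=1)] == labels[~seen_mask]).mean()
print("nearest-attribute accuracy on toy unseen data:", accuracy)
```

Because the pseudo-attributes are decoded per sample instead of per class, they vary within a class, which is the property the abstract credits with capturing differences within unseen classes. The toy uses more seen classes than attribute dimensions so that the seen attributes span the attribute space; with fewer seen classes, a linear decoder of this kind would not be expected to transfer to unseen classes.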