It is well known that zero-shot learning (ZSL) can suffer severely from domain shift, where the true and learned data distributions for the unseen classes do not match. Although transductive ZSL (TZSL) attempts to mitigate this by allowing the use of unlabelled examples from the unseen classes, a substantial distribution shift remains. We propose a novel TZSL model, named Bi-VAEGAN, which substantially reduces the shift by strengthening the distribution alignment between the visual and auxiliary spaces. The key components of the model design are (1) a bi-directional distribution alignment, (2) a simple but effective L_2-norm based feature normalization approach, and (3) a more sophisticated unseen class prior estimation approach. In benchmark evaluation on four datasets, Bi-VAEGAN achieves a new state of the art under both the standard and generalized TZSL settings. Code is available at https://github.com/Zhicaiwww/Bi-VAEGAN
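To make component (2) concrete, the following is a minimal sketch of generic L_2-norm feature normalization, in which each feature vector is rescaled to unit Euclidean length. The abstract does not specify where or how Bi-VAEGAN applies this step, so the function name `l2_normalize`, the `eps` guard, and the toy inputs are illustrative assumptions rather than the authors' implementation.

```python
import math


def l2_normalize(features, eps=1e-12):
    """Rescale each feature vector to unit L2 norm.

    `eps` is an assumed safeguard that avoids division by zero
    for all-zero vectors; it is not taken from the paper.
    """
    normalized = []
    for vec in features:
        # Euclidean (L2) norm of the vector
        norm = math.sqrt(sum(x * x for x in vec))
        normalized.append([x / max(norm, eps) for x in vec])
    return normalized


# Toy visual features (hypothetical values, for illustration only)
feats = [[3.0, 4.0], [0.0, 0.0], [1.0, 1.0]]
unit_feats = l2_normalize(feats)
```

After this step every non-zero vector lies on the unit hypersphere, so only its direction, not its magnitude, carries information.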