Compositional Zero-Shot Learning (CZSL) aims to transfer knowledge from seen state-object pairs to novel, unseen pairs. In this process, visual bias caused by the diverse interrelationships among state-object combinations blurs their visual features, hindering the learning of distinguishable class prototypes. Prevailing methods concentrate on disentangling states and objects directly from visual features, overlooking the improvements available from a data perspective. Experimentally, we find that the effects of this problem closely approximate a long-tailed distribution. As a solution, we transform CZSL into a proximate class imbalance problem. We mathematically derive the role of the class prior within the long-tailed distribution in CZSL. Building on this insight, we incorporate the visual bias caused by compositions into the classifier's training and inference by estimating it as a proximate class prior. This encourages the classifier to learn more discernible class prototypes for each composition, thereby achieving more balanced predictions. Experimental results demonstrate that our approach raises the model's performance to the state-of-the-art level without introducing additional parameters. Our code is available at \url{https://github.com/LanchJL/ProLT-CZSL}.
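The idea of folding a class prior into a classifier's training and inference can be illustrated with generic logit adjustment from the long-tailed learning literature. The sketch below is not the authors' formulation: the abstract estimates the prior from composition-induced visual bias, whereas this example assumes a fixed, hypothetical `prior` vector over compositions. The function names and the temperature `tau` are likewise assumptions made for illustration.

```python
import numpy as np

def log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

def prior_adjusted_loss(logits, labels, prior, tau=1.0):
    """Cross-entropy on logits shifted by tau * log(prior).

    Adding the log-prior during training makes classes with a small
    prior pay a larger penalty unless their raw logit margin grows.
    `prior` is a hypothetical per-composition prior of shape (C,).
    """
    adjusted = logits + tau * np.log(prior)
    log_probs = log_softmax(adjusted)
    return -log_probs[np.arange(len(labels)), labels].mean()

def prior_adjusted_predict(logits, prior, tau=1.0):
    """Post-hoc variant: subtract tau * log(prior) before the argmax."""
    return np.argmax(logits - tau * np.log(prior), axis=-1)

# Toy example: two compositions, the first heavily favoured by the prior.
logits = np.array([[2.0, 1.9]])
prior = np.array([0.9, 0.1])
raw_prediction = np.argmax(logits, axis=-1)
balanced_prediction = prior_adjusted_predict(logits, prior)
```

In this toy case the raw argmax picks index 0, while the prior-adjusted prediction picks index 1, because the small prior of the second composition offsets its slightly lower logit. No parameters are added to the model; only the logits are shifted by a fixed vector.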