Attribute and object (A-O) disentanglement is a fundamental problem in Compositional Zero-shot Learning (CZSL), which aims to recognize novel A-O compositions from previously acquired knowledge. Existing methods based on disentangled representation learning overlook the contextual dependency between the two primitives in an A-O pair. Motivated by this, we propose a novel A-O disentanglement framework for CZSL, the Class-specified Cascaded Network (CSCNet). The key insight is to first classify one primitive and then use the predicted class as a prior to guide recognition of the other primitive in a cascaded fashion. To this end, CSCNet constructs Attribute-to-Object and Object-to-Attribute cascaded branches, in addition to a composition branch that models the two primitives as a whole. Notably, we devise a parametric classifier (ParamCls) to improve the matching between visual and semantic embeddings. By improving A-O disentanglement, our framework outperforms previous competitive methods.
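The cascaded idea described in the abstract (classify one primitive, then use the predicted class as a prior for the other) can be illustrated with a minimal toy sketch. Everything below is an assumption made for illustration and is not CSCNet's actual architecture: the abstract does not say how the prior is injected, so this sketch simply adds the predicted class embedding to the visual feature, and it uses plain cosine similarity in place of ParamCls. The names `cosine`, `classify`, `cascade`, `weight`, and the toy embeddings are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def classify(feature, class_embeddings):
    """Return (best_label, scores) by cosine similarity.
    Stand-in for a learned classifier; NOT the paper's ParamCls."""
    scores = {label: cosine(feature, emb)
              for label, emb in class_embeddings.items()}
    return max(scores, key=scores.get), scores

def cascade(feature, first_embeddings, second_embeddings, weight=0.5):
    """Classify the first primitive, then condition the feature on the
    predicted class before classifying the second primitive.
    Assumption: the prior is injected by adding the scaled embedding."""
    first_label, _ = classify(feature, first_embeddings)
    prior = first_embeddings[first_label]
    conditioned = [f + weight * p for f, p in zip(feature, prior)]
    second_label, _ = classify(conditioned, second_embeddings)
    return first_label, second_label

# Toy semantic embeddings in a shared 4-d space (hypothetical values).
ATTRS = {"wet": [1.0, 0.0, 0.0, 0.0], "dry": [0.0, 1.0, 0.0, 0.0]}
OBJS = {"road": [0.6, 0.0, 0.8, 0.0], "sand": [0.0, 0.6, 0.0, 0.8]}
feature = [0.5, 0.1, 0.2, 0.6]  # toy visual feature

# Attribute-to-Object direction: attribute first, then object.
a2o = cascade(feature, ATTRS, OBJS)
# Object-to-Attribute direction: swap the roles of the two primitives.
o2a = cascade(feature, OBJS, ATTRS)
print("A->O:", a2o)
print("O->A:", o2a)
```

In this toy setting, classifying the object on its own picks "sand", while the Attribute-to-Object cascade first predicts "wet" and then picks "road", which shows how a predicted primitive can shift the decision for the other. A full model would presumably combine both cascaded directions with the composition branch; how that combination is done is not stated in the abstract.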