In this work, we present a robust approach for joint part and object segmentation. Specifically, we reformulate object and part segmentation as an optimization problem and build a hierarchical feature representation, comprising pixel-, part-, and object-level embeddings, to solve it through bottom-up clustering. Pixels are grouped into clusters, with the part-level embeddings serving as cluster centers. Object masks are then obtained by compositing the resulting part proposals. This bottom-up interaction proves effective at integrating information from lower semantic levels into higher ones. Building on it, our approach, Compositor, produces part and object segmentation masks simultaneously while improving mask quality. Compositor achieves state-of-the-art performance on PartImageNet and Pascal-Part, outperforming previous methods in part and object mIoU by around 0.9% and 1.3% on PartImageNet and by around 0.4% and 1.7% on Pascal-Part, respectively. It is also more robust to occlusion, with gains of around 4.4% and 7.1% on part and object segmentation, respectively. Code will be available at https://github.com/TACJu/Compositor.
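The bottom-up grouping described in the abstract can be illustrated with a minimal NumPy sketch. It is not the released Compositor implementation: the function names, the dot-product similarity, the softmax assignments, and the toy embeddings are all assumptions made for illustration. Pixels are softly assigned to part-level embeddings acting as cluster centers, and object masks are then composited from the part masks using an assumed part-to-object affinity.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def cluster_pixels_into_parts(pixel_emb, part_emb):
    """Softly assign each pixel to a part, treating part embeddings as centers.

    pixel_emb: (N, D) pixel-level embeddings.
    part_emb:  (K, D) part-level embeddings (hypothetical cluster centers).
    Returns an (N, K) matrix whose rows sum to 1.
    """
    # Assumption: similarity is a plain dot product followed by a softmax.
    return softmax(pixel_emb @ part_emb.T, axis=1)


def composite_parts_into_objects(part_probs, part_emb, object_emb):
    """Build object masks by compositing part masks.

    Assumption: each part is softly assigned to objects by its similarity
    to the object-level embeddings; the source does not specify this rule.
    Returns an (N, M) matrix whose rows sum to 1.
    """
    part_to_object = softmax(part_emb @ object_emb.T, axis=1)  # (K, M)
    return part_probs @ part_to_object  # (N, M)


# Toy data: 4 pixels, 3 parts, 2 objects, embedding dimension 2.
pixel_emb = np.array([[3.0, 0.0], [0.0, 3.0], [-3.0, 0.0], [2.5, 0.5]])
part_emb = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
object_emb = np.array([[1.0, 1.0], [-1.0, 0.0]])

part_probs = cluster_pixels_into_parts(pixel_emb, part_emb)
object_probs = composite_parts_into_objects(part_probs, part_emb, object_emb)

# Hard labels per pixel, taken as the most likely part and object.
part_labels = part_probs.argmax(axis=1)
object_labels = object_probs.argmax(axis=1)
```

Because both assignment matrices are row-stochastic, the composited object probabilities also sum to 1 for every pixel, so part and object masks come out of the same forward pass and stay consistent with each other.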