We study the challenging problem of unsupervised discovery of object landmarks. Many recent methods rely on bottlenecks to generate 2D Gaussian heatmaps; however, these bottlenecks are limited in their ability to generate informed heatmaps during training, presumably due to the lack of effective structural cues. These methods also assume that all predicted landmarks are semantically relevant, despite having no ground-truth supervision. In this work, we introduce a consistency-guided bottleneck in an image reconstruction-based pipeline. The bottleneck leverages landmark consistency, a measure of compatibility with the pseudo-ground truth, to generate adaptive heatmaps. We propose obtaining this pseudo-supervision by forming landmark correspondences across images. The consistency then modulates the uncertainty of the discovered landmarks when generating adaptive heatmaps, which rank consistent landmarks above their noisy counterparts and provide effective structural information for improved robustness. Evaluations on five diverse datasets (MAFL, AFLW, LS3D, Cats, and Shoes) demonstrate excellent performance of the proposed approach compared to existing state-of-the-art methods. Our code is publicly available at https://github.com/MamonaAwan/CGB_ULD.
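To make the idea of consistency-modulated heatmaps concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that each landmark carries a consistency score in [0, 1] and that low consistency is treated as high uncertainty, so the Gaussian becomes wider and, after normalization, has a lower peak. The function name `adaptive_heatmaps`, the linear mapping from consistency to standard deviation, and the parameters `sigma_min` and `sigma_max` are all hypothetical choices made for illustration; the abstract does not specify them.

```python
import numpy as np

def adaptive_heatmaps(landmarks, consistency, size=64, sigma_min=1.0, sigma_max=4.0):
    """Render one 2D Gaussian heatmap per landmark.

    landmarks:   (K, 2) array-like of (x, y) pixel coordinates.
    consistency: (K,) array-like of scores in [0, 1] (assumed to be given).
    The consistency-to-sigma mapping below is an illustrative assumption.
    """
    landmarks = np.asarray(landmarks, dtype=np.float64)
    consistency = np.clip(np.asarray(consistency, dtype=np.float64), 0.0, 1.0)

    # Assumed mapping: high consistency -> small sigma (low uncertainty).
    sigma = sigma_max - consistency * (sigma_max - sigma_min)  # shape (K,)

    # Pixel grid; ys indexes rows, xs indexes columns.
    ys, xs = np.mgrid[0:size, 0:size]
    dx = xs[None] - landmarks[:, 0, None, None]  # shape (K, size, size)
    dy = ys[None] - landmarks[:, 1, None, None]

    heat = np.exp(-(dx ** 2 + dy ** 2) / (2.0 * sigma[:, None, None] ** 2))

    # Normalize each map to sum to 1, so wider (less consistent) maps peak lower.
    heat /= heat.sum(axis=(1, 2), keepdims=True)
    return heat

# Two landmarks: the first is highly consistent, the second is noisy.
heatmaps = adaptive_heatmaps([[20, 20], [44, 40]], [0.9, 0.1])
```

Under these assumptions, the consistent landmark yields a sharper heatmap with a higher peak than the noisy one, which is one simple way a reconstruction network could be led to rely more on consistent landmarks.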