We propose Probabilistic Warp Consistency, a weakly-supervised learning objective for semantic matching. Our approach directly supervises the dense matching scores predicted by the network, encoded as a conditional probability distribution. We first construct an image triplet by applying a known warp to one of the images in a pair depicting different instances of the same object class. Our probabilistic learning objectives are then derived using the constraints arising from the resulting image triplet. We further account for occlusion and background clutter present in real image pairs by extending our probabilistic output space with a learnable unmatched state. To supervise it, we design an objective between image pairs depicting different object classes. We validate our method by applying it to four recent semantic matching architectures. Our weakly-supervised approach sets a new state of the art on four challenging semantic matching benchmarks. Lastly, we demonstrate that our objective also brings substantial improvements in the strongly-supervised regime, when combined with keypoint annotations.
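To make the probabilistic formulation concrete, the following is a minimal NumPy sketch of one way the ingredients described above could fit together. It is an illustration under stated assumptions, not the implementation evaluated in this work. All function names are hypothetical, the unmatched logit is a fixed scalar here rather than a learned parameter, the toy "known warp" is a random permutation of flattened positions, and the rule used to chain two distributions through the intermediate image is one plausible reading of the triplet constraint.

```python
import numpy as np


def matching_distribution(scores, unmatched_logit=0.0):
    """Turn raw matching scores of shape (n_query, n_ref) into a conditional
    distribution of shape (n_query, n_ref + 1). The last column is the
    probability of the extra "unmatched" state (assumed: a single shared logit)."""
    n_query = scores.shape[0]
    extra = np.full((n_query, 1), unmatched_logit, dtype=scores.dtype)
    logits = np.concatenate([scores, extra], axis=1)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)


def compose(p_ab, p_bc):
    """Chain A->B and B->C by marginalizing over positions in B (assumption).
    Mass that is unmatched at either step ends up in the unmatched state."""
    matched = p_ab[:, :-1] @ p_bc[:, :-1]
    unmatched = p_ab[:, -1:] + p_ab[:, :-1] @ p_bc[:, -1:]
    return np.concatenate([matched, unmatched], axis=1)


def warp_cross_entropy(p, target_idx):
    """Cross-entropy between predicted distributions and the known warp,
    given as one target index per query position."""
    picked = p[np.arange(p.shape[0]), target_idx]
    return float(-np.log(picked + 1e-12).mean())


def unmatched_loss(p):
    """For a pair showing different object classes, push every query
    position towards the unmatched state."""
    return float(-np.log(p[:, -1] + 1e-12).mean())


# Toy triplet: I' is a warped copy of I, and J shows another instance.
rng = np.random.default_rng(0)
n = 6                                      # flattened feature positions
scores_iprime_to_j = rng.normal(size=(n, n))
scores_j_to_i = rng.normal(size=(n, n))
known_warp = rng.permutation(n)            # I' -> I, known by construction

p_composed = compose(
    matching_distribution(scores_iprime_to_j),
    matching_distribution(scores_j_to_i),
)
loss = warp_cross_entropy(p_composed, known_warp)

# Pair from different classes: supervise only the unmatched state.
neg_loss = unmatched_loss(matching_distribution(rng.normal(size=(n, n))))
```

In this sketch the only supervision signal is the synthetic warp between I and I', which is what makes the setting weakly supervised: no keypoint annotations between I and J are required.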