Deep metric learning (DML) trains a network to learn a semantically meaningful representation space. Many current approaches mine n-tuples of examples and model interactions within each tuple. We present a novel, compositional DML model, inspired by electrostatic fields in physics, that represents the influence of each example (embedding) not through tuples but as a continuous potential field, and superposes these fields to obtain their combined global potential field. We use attractive/repulsive potential fields to represent interactions among embeddings from images of the same/different classes. Contrary to typical learning methods, where the mutual influence of samples grows in proportion to their distance, we enforce that this influence decreases with distance, yielding a decaying field. We show that such decay helps improve performance on real-world datasets with large intra-class variations and label noise. Like other proxy-based methods, we use proxies to succinctly represent sub-populations of examples. We evaluate our method on three standard DML benchmarks (Cars-196, CUB-200-2011, and SOP), where it outperforms state-of-the-art baselines.
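To make the field construction concrete, the following is a minimal PyTorch sketch of one way such a superposed, decaying potential-field loss could look. The bounded 1/(1 + r) falloff, the L2 normalization, and all names (e.g., potential_field_loss) are our assumptions for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def potential_field_loss(embeddings, labels, proxies, proxy_labels):
    """Sketch of a superposed attractive/repulsive potential-field loss.

    Each proxy exerts a potential on every embedding: attractive for
    same-class pairs, repulsive for different-class pairs. A bounded
    1/(1 + r) falloff (an assumption, standing in for the paper's exact
    potential) makes both influences decay with distance, so distant,
    possibly mislabeled samples contribute little.
    """
    # L2-normalize (a common DML convention, assumed here) and compute
    # pairwise Euclidean distances between embeddings and proxies: (B, P).
    r = torch.cdist(F.normalize(embeddings, dim=1),
                    F.normalize(proxies, dim=1))
    same = labels.unsqueeze(1).eq(proxy_labels.unsqueeze(0))  # (B, P) bool

    # Attractive potential -1/(1+r): minimizing it pulls an embedding
    # toward same-class proxies. Repulsive potential +1/(1+r): minimizing
    # it pushes embeddings away from other-class proxies. Both forces
    # (gradient magnitudes ~ 1/(1+r)^2) decay with distance.
    potential = torch.where(same, -1.0 / (1.0 + r), 1.0 / (1.0 + r))

    # Superposition: sum the per-proxy fields, then average over the batch.
    return potential.sum(dim=1).mean()

# Toy usage: 8 embeddings in R^64, 4 learnable proxies for 4 classes.
emb = torch.randn(8, 64, requires_grad=True)
proxies = torch.nn.Parameter(torch.randn(4, 64))
loss = potential_field_loss(emb, torch.randint(0, 4, (8,)),
                            proxies, torch.arange(4))
loss.backward()
```

Note one property of this decaying form: because both potentials are bounded, far-away interactions are effectively muted rather than amplified, which is consistent with the robustness to label noise claimed above.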