Global average pooling (GAP) is a popular component in deep metric learning (DML) for aggregating features. Its effectiveness is often attributed to treating each feature vector as a distinct semantic entity and GAP as a combination of them. Although this explanation is well substantiated, its algorithmic implications for learning generalizable entities that can represent unseen classes, a crucial DML goal, remain unclear. To address this, we formulate GAP as a convex combination of learnable prototypes. We then show that prototype learning can be expressed as a recursive process that fits a linear predictor to a batch of samples. Building on this perspective, we consider two batches of disjoint classes at each iteration and regularize the learning by expressing the samples of one batch with the prototypes fitted to the other batch. We validate our approach on four popular DML benchmarks.
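The convex-combination view of GAP can be illustrated with a minimal NumPy sketch. It assumes, purely for illustration, that each local feature vector is itself a convex combination of a small set of prototypes. The names `prototypes`, `assign`, and `gap_as_prototype_mixture`, the softmax-based assignment, and the array sizes are hypothetical and not taken from the paper. Under that assumption, averaging the features gives the same vector as mixing the prototypes with the averaged assignment weights, and those weights are still non-negative and sum to one.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax; each slice along `axis` sums to 1."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def gap_as_prototype_mixture(n=49, k=8, d=16, seed=0):
    """Compare GAP over n local features with a convex mix of k prototypes.

    Illustrative assumption (not the paper's algorithm): every local
    feature is a convex combination of the same k prototype vectors.
    """
    rng = np.random.default_rng(seed)
    prototypes = rng.normal(size=(k, d))        # hypothetical prototypes, one per row
    assign = softmax(rng.normal(size=(n, k)))   # per-feature simplex weights, shape (n, k)
    features = assign @ prototypes              # local feature vectors, shape (n, d)

    pooled = features.mean(axis=0)              # global average pooling over the n features
    weights = assign.mean(axis=0)               # averaged weights, still on the simplex
    mixture = weights @ prototypes              # convex combination of the prototypes
    return pooled, mixture, weights


pooled, mixture, weights = gap_as_prototype_mixture()
print(np.allclose(pooled, mixture))
```

The equality follows from linearity of the mean: averaging `assign @ prototypes` over rows is the same as averaging `assign` first and then multiplying by `prototypes`. The sketch does not cover how the prototypes are learned or the cross-batch regularization described in the abstract.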