Contrastive learning and its variants have proven highly effective as self-supervised learning methods across diverse domains. Contrastive learning measures the distance between representations with cosine similarity and trains them with a cross-entropy objective. Within the same cosine-similarity-based framework, margins have played a significant role in improving face and speaker recognition. Interestingly, although contrastive learning relies on the same similarity metric and objective function, it has not actively adopted margins. Furthermore, the effect of margins in contrastive learning has so far been explained only in terms of decision boundaries. In this work, we propose a new perspective for understanding the role of margins, based on gradient analysis. From this perspective, we analyze how margins affect the gradients of contrastive learning and decompose that effect into more elementary components. We analyze each component separately and suggest possible directions for improving contrastive learning. Our experimental results demonstrate that emphasizing positive samples and scaling gradients according to positive-sample angles and logits are the keys to improving the generalization performance of contrastive learning on both seen and unseen datasets, whereas the other factors improve performance only marginally.
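To make the setting concrete, the following is a minimal NumPy sketch of a cosine-similarity, cross-entropy contrastive loss with optional margins applied to the positive pair. It is not the formulation used in this work. It assumes an angular margin of the form cos(θ + m) and a subtractive margin of the form cos θ − m, as commonly used in face recognition losses. The function name `margin_info_nce` and the parameters `temperature`, `angular_margin`, and `subtractive_margin` are hypothetical names chosen for illustration.

```python
import numpy as np


def margin_info_nce(z1, z2, temperature=0.5,
                    angular_margin=0.0, subtractive_margin=0.0):
    """Illustrative contrastive loss with margins on the positive pair.

    z1, z2: arrays of shape (N, D). Row i of z1 and row i of z2 form a
    positive pair; every other row of z2 acts as a negative for z1[i].
    The margin forms are assumptions, not taken from the source text.
    """
    # L2-normalize so that dot products are cosine similarities.
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    cos = np.clip(z1 @ z2.T, -1.0, 1.0)  # shape (N, N)

    # Apply margins to the positive (diagonal) entries only:
    # cos(theta_pos + angular_margin) - subtractive_margin.
    # Note: cos(theta + m) stops decreasing once theta + m exceeds pi;
    # this sketch does not handle that case specially.
    idx = np.arange(cos.shape[0])
    theta_pos = np.arccos(cos[idx, idx])
    logits = cos.copy()
    logits[idx, idx] = np.cos(theta_pos + angular_margin) - subtractive_margin
    logits = logits / temperature

    # Cross-entropy with the positive as the target class,
    # computed through a numerically stable log-softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[idx, idx].mean()


# Usage: two random "views" of a batch of 8 samples with 16-dim embeddings.
rng = np.random.default_rng(0)
view_a = rng.normal(size=(8, 16))
view_b = view_a + 0.1 * rng.normal(size=(8, 16))
loss_plain = margin_info_nce(view_a, view_b)
loss_margin = margin_info_nce(view_a, view_b, subtractive_margin=0.1)
```

Because the margins lower only the positive logit, they change how strongly each positive pair contributes to the gradient, which is the kind of effect a gradient-based analysis would examine.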