Contrastive learning (CL) continues to achieve significant breakthroughs across multiple domains. However, the most common InfoNCE-based methods suffer from known dilemmas, such as the uniformity-tolerance dilemma (UTD) and gradient reduction. UTD has been shown to cause unexpected performance degradation. We argue that the fixed temperature is to blame for UTD. To tackle this challenge, we enrich the CL loss family with a Model-Aware Contrastive Learning (MACL) strategy, whose temperature adapts to the magnitude of alignment, which reflects the basic confidence of the instance discrimination task; this enables the CL loss to adaptively adjust the penalty strength for hard negatives. For the other dilemma, gradient reduction, we derive the limits of the gradient scaling factor involved, which allows us to explain from a unified perspective why some recent approaches are effective with fewer negative samples, and we accordingly present a gradient reweighting scheme to escape this dilemma. Extensive empirical results in the vision, sentence, and graph modalities validate our approach's general improvement for representation learning and downstream tasks.
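As a rough illustration of an alignment-adaptive temperature, the sketch below computes an InfoNCE loss whose temperature depends on the mean cosine similarity of positive pairs. The abstract does not state the update rule, so the linear form `tau = tau0 * (1 + alpha * (A - a0))`, the parameter names `tau0`, `alpha`, and `a0`, and all function names are assumptions made for illustration, not the authors' implementation. It uses only the Python standard library; a real training loop would use an autodiff framework and would probably treat the temperature as a constant during backpropagation.

```python
import math


def normalize(v):
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def adaptive_temperature(anchors, positives, tau0=0.1, alpha=0.5, a0=0.0):
    """Assumed rule: tau = tau0 * (1 + alpha * (A - a0)).

    A is the mean cosine similarity of positive pairs (the "alignment").
    Inputs are expected to be unit-normalized. The rule is hypothetical.
    """
    alignment = sum(dot(a, p) for a, p in zip(anchors, positives)) / len(anchors)
    return tau0 * (1.0 + alpha * (alignment - a0)), alignment


def info_nce(anchors, positives, tau):
    """InfoNCE loss; positives[j] with j != i serve as negatives for anchors[i]."""
    total = 0.0
    for i, a in enumerate(anchors):
        logits = [dot(a, p) / tau for p in positives]
        m = max(logits)  # subtract the max for a stable log-sum-exp
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(anchors)


def macl_style_loss(view1, view2, tau0=0.1, alpha=0.5, a0=0.0):
    """Normalize both views, pick the temperature from alignment, return the loss."""
    z1 = [normalize(v) for v in view1]
    z2 = [normalize(v) for v in view2]
    tau, alignment = adaptive_temperature(z1, z2, tau0, alpha, a0)
    return info_nce(z1, z2, tau), tau, alignment


if __name__ == "__main__":
    view1 = [[1.0, 0.0], [0.0, 1.0]]
    view2 = [[0.9, 0.1], [0.2, 0.8]]
    loss, tau, alignment = macl_style_loss(view1, view2)
    print(f"alignment={alignment:.3f} tau={tau:.3f} loss={loss:.4f}")
```

Under this assumed rule, low alignment (a poorly trained encoder) gives a smaller temperature and therefore a stronger penalty on hard negatives, while high alignment gives a larger, more tolerant temperature.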