InfoNCE is the standard contrastive learning objective, but its softmax form is not only a computational convenience: it also encodes a statistical assumption about how the top-scoring example is selected. Using extreme value theory, we show that this assumption is often misaligned with the normalized embedding setting used in modern contrastive learning. Motivated by this mismatch, we propose \textsc{WEINCE}, a simple modification of InfoNCE that uses anchor-wise online batch statistics to blend the usual softmax logits with an endpoint shortfall correction, adding no trainable parameters. Across five vision benchmarks, \textsc{WEINCE} yields consistent improvements in frozen-feature evaluation. These results show that a more faithful statistical treatment of hard negatives can improve contrastive objectives.
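For reference, the baseline objective that the abstract starts from can be sketched as follows. This is a minimal NumPy illustration of standard InfoNCE with in-batch negatives on L2-normalized embeddings, not the proposed \textsc{WEINCE} loss, whose correction term is not specified here. The function name `info_nce`, the temperature value, and the use of in-batch negatives are assumptions made for illustration.

```python
import numpy as np

def info_nce(anchors, positives, temperature=0.1):
    """Standard InfoNCE (assumed baseline form) with in-batch negatives.

    Row i of `positives` is the positive for row i of `anchors`;
    every other row in the batch serves as a negative.
    """
    # L2-normalize so each similarity is a cosine, bounded in [-1, 1].
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)

    # Scaled similarity logits; entry (i, j) compares anchor i with candidate j.
    logits = a @ p.T / temperature

    # Row-wise log-softmax, shifted by the row max for numerical stability.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_softmax = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Cross-entropy against the diagonal (the matching positive).
    return -np.mean(np.diag(log_softmax))
```

Because the embeddings are normalized, every logit is bounded above by `1 / temperature`, which is the bounded-score setting the abstract refers to.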