Contrastive learning has become a cornerstone of modern representation learning, enabling training on massive unlabeled data for both task-specific and general (foundation) models. A prototypical loss in contrastive training is InfoNCE, along with its variants. In this work, we show that the InfoNCE objective induces Gaussian structure in the representations that emerge from contrastive training. We establish this result in two complementary regimes. First, we show that under certain alignment and concentration assumptions, projections of the high-dimensional representation asymptotically approach a multivariate Gaussian distribution. Second, under less strict assumptions, we show that adding a small, asymptotically vanishing regularization term that promotes low feature norm and high feature entropy leads to similar asymptotic results. We support our analysis with experiments on synthetic data and CIFAR-10 across multiple encoder architectures and sizes, which show consistent Gaussian behavior. This perspective provides a principled explanation for the Gaussianity commonly observed in contrastive representations. The resulting Gaussian model enables analytical treatment of learned representations and is expected to support a wide range of applications in contrastive learning.
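For readers less familiar with the objective, the following is a minimal sketch of a standard batch InfoNCE loss, together with a simple moment-based check of how Gaussian a one-dimensional projection of the features looks. It is an illustration under stated assumptions, not the formulation or the diagnostic used in this work: the function names `info_nce` and `projection_moments`, the cosine-similarity form, and the default temperature of 0.1 are all chosen here for the example.

```python
import numpy as np


def info_nce(z_a, z_b, temperature=0.1):
    """Standard batch InfoNCE (assumed form): row i of z_a and row i of z_b
    are a positive pair; every other row of z_b is a negative for z_a[i]."""
    # L2-normalize so that dot products are cosine similarities.
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / temperature  # shape (N, N)
    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Average negative log-probability assigned to the positive pair.
    return float(-np.mean(np.diag(log_prob)))


def projection_moments(z, direction):
    """Skewness and excess kurtosis of the projection of z onto `direction`.
    Both are zero for an exactly Gaussian projection."""
    p = z @ (direction / np.linalg.norm(direction))
    p = (p - p.mean()) / p.std()
    return float(np.mean(p ** 3)), float(np.mean(p ** 4) - 3.0)
```

In practice one would pass the encoder outputs for two augmented views as `z_a` and `z_b`, and evaluate `projection_moments` on the learned features along several random directions; values near zero are consistent with, but do not by themselves prove, Gaussian behavior.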