We present a novel objective function for cluster-based self-supervised learning (SSL) that is designed to circumvent a triad of failure modes: representation collapse, cluster collapse, and invariance to permutations of cluster assignments. The objective consists of three key components: (i) a generative term that penalizes representation collapse, (ii) a term that promotes invariance to data augmentations, thereby addressing the issue of label permutations, and (iii) a uniformity term that penalizes cluster collapse. The proposed objective also has two notable advantages. First, it can be interpreted from a Bayesian perspective as a lower bound on the data log-likelihood. Second, it enables the training of a standard backbone architecture without the need for asymmetric elements such as stop gradients, momentum encoders, or specialized clustering layers. Owing to its simplicity and theoretical foundation, the proposed objective is well suited for optimization. Experiments on both toy and real-world data demonstrate its effectiveness.
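To make the three-term structure concrete, the following is a minimal NumPy sketch of how such an objective might be assembled. It is an illustration under stated assumptions, not the formulation used in this work: the function names, the cross-entropy forms chosen for the invariance and uniformity terms, and the weights `w_inv` and `w_unif` are all hypothetical. The generative term is passed in as a precomputed scalar (`generative_nll`) because the abstract does not specify the generative model.

```python
import numpy as np

def softmax(logits, axis=-1):
    """Numerically stable softmax over cluster logits."""
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def invariance_term(logits_a, logits_b, eps=1e-12):
    """Assumed form: cross-entropy between the cluster assignments of two
    augmented views of the same batch. It is small only when both views
    place each sample in the same cluster with high confidence."""
    p_a = softmax(logits_a)
    p_b = softmax(logits_b)
    return float(-(p_a * np.log(p_b + eps)).sum(axis=1).mean())

def uniformity_term(logits, eps=1e-12):
    """Assumed form: cross-entropy between a uniform prior over clusters and
    the batch-averaged assignment. It is smallest when all clusters are used
    equally and grows when assignments collapse onto a few clusters."""
    p_mean = softmax(logits).mean(axis=0)
    return float(-np.log(p_mean + eps).mean())

def total_loss(logits_a, logits_b, generative_nll, w_inv=1.0, w_unif=1.0):
    """Hypothetical combination of the three terms. `generative_nll` is a
    placeholder for the generative term, computed elsewhere."""
    return (generative_nll
            + w_inv * invariance_term(logits_a, logits_b)
            + w_unif * uniformity_term(logits_a))

# Toy check: 4 samples, 4 clusters.
balanced = 10.0 * np.eye(4)                                  # one sample per cluster
collapsed = np.tile(np.array([10.0, 0.0, 0.0, 0.0]), (4, 1))  # all samples in cluster 0
permuted = np.roll(balanced, 1, axis=1)                      # same partition, relabeled

print("uniformity (balanced): ", uniformity_term(balanced))
print("uniformity (collapsed):", uniformity_term(collapsed))
print("invariance (same labels):    ", invariance_term(balanced, balanced))
print("invariance (permuted labels):", invariance_term(balanced, permuted))
```

In this toy setting the uniformity term is larger for the collapsed assignment than for the balanced one, and the invariance term is larger when the second view uses a permuted labeling of the same partition, which matches the roles the abstract assigns to these two terms.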