Self-Supervised Learning (SSL) methods such as VICReg, Barlow Twins, or W-MSE avoid collapse of their joint embedding architectures by constraining or regularizing the covariance matrix of their projector's output. This study highlights important properties of this strategy, which we term Variance-Covariance regularization (VCReg). More precisely, we show that {\em VCReg combined with an MLP projector enforces pairwise independence between the features of the learned representation}. This result emerges by bridging VCReg applied to the projector's output with kernel independence criteria applied to the projector's input. We empirically validate our findings: (i) we identify which projector characteristics favor pairwise independence, (ii) we demonstrate that pairwise independence is beneficial for out-of-domain generalization, and (iii) we demonstrate that the scope of VCReg goes beyond SSL by using it to solve Independent Component Analysis. This provides the first theoretical motivation and explanation of MLP projectors in SSL.
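To make the idea of regularizing the covariance matrix of the projector's output concrete, the following is a minimal NumPy sketch of a variance-covariance penalty. It is an illustration under stated assumptions, not the exact objective studied here: the hinge on the per-feature standard deviation and the squared off-diagonal penalty follow the form commonly associated with VICReg, and the function name `vcreg_loss` and the parameters `gamma` and `eps` are hypothetical choices made for this example.

```python
import numpy as np


def vcreg_loss(z, gamma=1.0, eps=1e-4):
    """Illustrative variance-covariance penalty on projector outputs.

    z: array of shape (n, d), a batch of n projector outputs with d features.
    Returns (var_term, cov_term). Both are small when every feature has a
    standard deviation of at least `gamma` and features are decorrelated.
    The exact form is an assumption made for this sketch.
    """
    n, d = z.shape
    z = z - z.mean(axis=0, keepdims=True)      # center each feature
    cov = (z.T @ z) / (n - 1)                  # (d, d) sample covariance
    std = np.sqrt(np.diag(cov) + eps)          # per-feature standard deviation
    # Variance term: hinge that penalizes features whose std falls below gamma.
    var_term = np.mean(np.maximum(0.0, gamma - std))
    # Covariance term: squared off-diagonal entries, scaled by the dimension.
    off_diag = cov - np.diag(np.diag(cov))
    cov_term = np.sum(off_diag ** 2) / d
    return var_term, cov_term


rng = np.random.default_rng(0)
decorrelated = rng.standard_normal((4096, 8))  # unit variance, near-zero covariance
collapsed = np.ones((4096, 8))                 # every sample maps to the same point
redundant = decorrelated.copy()
redundant[:, 1] = redundant[:, 0]              # two perfectly correlated features

for name, batch in [("decorrelated", decorrelated),
                    ("collapsed", collapsed),
                    ("redundant", redundant)]:
    print(name, vcreg_loss(batch))
```

In this sketch the collapsed batch is penalized only through the variance term, while the batch with a duplicated feature is penalized through the covariance term. Note that such a penalty only acts on second-order statistics of the projector's output; the claim above concerns what this implies for the projector's input when the projector is an MLP.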