This paper proposes the Hierarchical Functional Maximal Correlation Algorithm (HFMCA), a methodology that characterizes dependencies across two hierarchical levels in multiview systems. By framing view similarities as dependencies and ensuring contrastivity by imposing orthonormality, HFMCA achieves faster convergence and increased stability in self-supervised learning. HFMCA defines and measures dependencies within image hierarchies, from pixels and patches to full images. We find that the network topology for approximating orthonormal basis functions aligns with a vanilla CNN, enabling the decomposition of density ratios between neighboring layers of feature maps. This decomposition makes the learned features interpretable and reveals the resemblance between supervision and self-supervision through the lens of internal representations.
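To make the orthonormality constraint and the density-ratio decomposition concrete, here is a minimal sketch for a pair of finite (discrete) variables. It is not the paper's neural implementation: it swaps in the classical maximal-correlation construction, in which a singular value decomposition of the normalized joint distribution yields basis functions that are orthonormal under the marginals and whose weighted products sum to the density ratio. The function name `density_ratio_decomposition` and the random toy joint table are assumptions made purely for illustration.

```python
import numpy as np


def density_ratio_decomposition(joint):
    """Decompose p(x, y) / (p(x) p(y)) for a discrete joint table.

    Hypothetical helper for illustration; not taken from the paper.
    Returns sigma, phi, psi (plus the marginals) such that
    ratio[x, y] = sum_k sigma[k] * phi[x, k] * psi[y, k].
    """
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()          # make sure the table sums to 1
    px = joint.sum(axis=1)               # marginal of X (rows)
    py = joint.sum(axis=0)               # marginal of Y (columns)

    # Normalized joint: q[x, y] = p(x, y) / sqrt(p(x) p(y)).
    q = joint / np.sqrt(np.outer(px, py))
    u, sigma, vt = np.linalg.svd(q, full_matrices=False)

    # Rescale the singular vectors so that the basis functions are
    # orthonormal under the marginals: E[phi_i(X) phi_j(X)] = delta_ij.
    phi = u / np.sqrt(px)[:, None]
    psi = vt.T / np.sqrt(py)[:, None]
    return sigma, phi, psi, px, py


rng = np.random.default_rng(0)
joint = rng.random((4, 5))               # toy joint table (assumption)
sigma, phi, psi, px, py = density_ratio_decomposition(joint)

# Gram matrices of the basis functions, weighted by the marginals.
gram_phi = phi.T @ (px[:, None] * phi)
gram_psi = psi.T @ (py[:, None] * psi)

# Density ratio computed directly and rebuilt from the decomposition.
ratio = (joint / joint.sum()) / np.outer(px, py)
reconstructed = (phi * sigma) @ psi.T

print(np.allclose(gram_phi, np.eye(len(sigma))))
print(np.allclose(gram_psi, np.eye(len(sigma))))
print(np.allclose(reconstructed, ratio))
```

In this discrete setting the leading singular value equals 1 and its basis functions are constant, so the statistical dependence between the two variables is carried by the remaining components. HFMCA, as described above, approximates such basis functions with a CNN rather than computing them by an explicit SVD.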