Representations learned by distinct neural networks have been observed to conceal structural similarities when the models are trained under similar inductive biases. From a geometric perspective, identifying the classes of transformations that connect these representations, together with the related invariances, is fundamental to unlocking applications such as merging, stitching, and reusing different neural modules. However, estimating task-specific transformations a priori can be challenging and expensive due to several factors (e.g., weight initialization, training hyperparameters, or data modality). To this end, we introduce a versatile method that directly incorporates a set of invariances into the representations: it constructs a product space of invariant components on top of the latent representations, without requiring prior knowledge of the optimal invariance to infuse. We validate our solution on classification and reconstruction tasks, observing consistent improvements in latent similarity and downstream performance in a zero-shot stitching setting. The experimental analysis comprises three modalities (vision, text, and graphs), twelve pretrained foundational models, nine benchmarks, and several architectures trained from scratch.
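To make the idea of a product space of invariant components concrete, the following is a minimal sketch rather than the authors' implementation. The abstract does not say how the components are built, so the sketch assumes that each component describes a latent vector by its similarity to a fixed set of anchor points under one metric, and that the components are combined by simple concatenation. The names `METRICS`, `product_space`, and `rotate`, the choice of metrics, and the toy 2-D data are all hypothetical.

```python
import math

# Assumption: each invariant component is "similarity to anchors" under one metric.
def cosine(u, v):
    # Unchanged when both vectors are rotated or rescaled together.
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def euclidean(u, v):
    # Unchanged under rotations and translations of the space.
    return math.dist(u, v)

def manhattan(u, v):
    # Unchanged under translations, but generally not under rotations.
    return sum(abs(a - b) for a, b in zip(u, v))

METRICS = (cosine, euclidean, manhattan)  # hypothetical choice of components

def product_space(x, anchors, metrics=METRICS):
    """Concatenate, metric by metric, the similarities of x to every anchor."""
    return [m(x, a) for m in metrics for a in anchors]

def rotate(p, theta):
    # Rotate a 2-D point counter-clockwise by theta radians.
    c, s = math.cos(theta), math.sin(theta)
    return (c * p[0] - s * p[1], s * p[0] + c * p[1])

# Toy latent point and anchors (illustrative values only).
anchors = [(1.0, 0.0), (0.0, 2.0), (-1.0, 1.0)]
x = (0.5, 1.5)

# The same point and anchors as seen in a second, rotated latent space.
theta = 0.7
z = product_space(x, anchors)
z_rot = product_space(rotate(x, theta), [rotate(a, theta) for a in anchors])

# The cosine and Euclidean blocks (first six entries) agree across the two
# spaces up to floating-point error; the Manhattan block need not.
n = len(anchors)
blocks_match = all(
    math.isclose(a, b, abs_tol=1e-9) for a, b in zip(z[: 2 * n], z_rot[: 2 * n])
)
print(blocks_match)
```

Under these assumptions, no single invariance has to be chosen in advance: each block of the concatenated vector is stable under a different class of transformations, and a downstream module can rely on whichever block matches the transformation that actually relates the two latent spaces.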