The Platonic Representation Hypothesis claims that recent foundation models are converging to a shared representation space as a function of their downstream task performance, irrespective of the objectives and data modalities used to train these models. Representational similarity is generally measured for individual datasets and is not necessarily consistent across datasets. Thus, one may wonder whether this convergence of model representations is confounded by the datasets commonly used in machine learning. Here, we propose a systematic way to measure how representational similarity between models varies with the set of stimuli used to construct the representations. We find that the objective function is the most crucial factor in determining the consistency of representational similarities across datasets. Specifically, self-supervised vision models learn representations whose relative pairwise similarities generalize better from one dataset to another compared to those of image classification or image-text models. Moreover, the correspondence between representational similarities and the models' task behavior is dataset-dependent, and is most pronounced for single-domain datasets. Our work provides a framework for systematically measuring similarities of model representations across datasets and linking those similarities to differences in task behavior.
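The measurement described above can be sketched in code. The abstract does not name a specific similarity measure, so the choice of linear CKA below is an assumption for illustration; the function names (`linear_cka`, `pairwise_similarity`, `cross_dataset_consistency`) are likewise hypothetical, not taken from the paper. The idea is: compute a model-by-model similarity matrix on each stimulus set, then ask how well the relative pairwise similarities on one dataset predict those on another.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA between two representation matrices of shape
    (n_stimuli, feature_dim). One common choice of similarity measure;
    the paper's actual measure may differ."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    num = np.linalg.norm(X.T @ Y, "fro") ** 2
    den = np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro")
    return num / den

def pairwise_similarity(reps):
    """Symmetric matrix of pairwise similarities between a list of
    model representations extracted on the same stimulus set."""
    m = len(reps)
    S = np.eye(m)
    for i in range(m):
        for j in range(i + 1, m):
            S[i, j] = S[j, i] = linear_cka(reps[i], reps[j])
    return S

def cross_dataset_consistency(S_a, S_b):
    """Spearman correlation between the off-diagonal entries of two
    similarity matrices (one per dataset): high values mean the relative
    pairwise similarities generalize from one stimulus set to the other."""
    iu = np.triu_indices_from(S_a, k=1)
    a, b = S_a[iu], S_b[iu]
    ra = np.argsort(np.argsort(a))  # Spearman = Pearson on ranks
    rb = np.argsort(np.argsort(b))
    return np.corrcoef(ra, rb)[0, 1]
```

In this sketch, each entry of `reps` would hold one model's features for the same set of stimuli; repeating `pairwise_similarity` on a second stimulus set and comparing the two matrices with `cross_dataset_consistency` operationalizes the question of whether representational convergence is dataset-dependent.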