The Platonic Representation Hypothesis suggests that representations from neural networks are converging to a common statistical model of reality. We show that the existing metrics used to measure representational similarity are confounded by network scale: increasing model depth or width can systematically inflate representational similarity scores. To correct these effects, we introduce a permutation-based null-calibration framework that transforms any representational similarity metric into a calibrated score with statistical guarantees. We revisit the Platonic Representation Hypothesis with our calibration framework, which reveals a nuanced picture: the apparent convergence reported by global spectral measures largely disappears after calibration, while local neighborhood similarity, but not local distances, retains significant agreement across different modalities. Based on these findings, we propose the Aristotelian Representation Hypothesis: representations in neural networks are converging to shared local neighborhood relationships.
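The calibration idea described above can be sketched in a few lines. This is an illustrative sketch only, not the paper's implementation: it assumes linear CKA as the example similarity metric and a row-shuffle (stimulus-permutation) null, and the function names (`linear_cka`, `calibrated_score`) are hypothetical. The observed score is compared against a null distribution built by permuting the correspondence between the two representations, yielding a permutation p-value and a z-score.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA, one common representational similarity metric.

    X, Y: (n_stimuli, n_features) representation matrices.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    num = np.linalg.norm(Yc.T @ Xc, "fro") ** 2
    den = np.linalg.norm(Xc.T @ Xc, "fro") * np.linalg.norm(Yc.T @ Yc, "fro")
    return num / den

def calibrated_score(X, Y, metric=linear_cka, n_perm=1000, seed=0):
    """Permutation-based null calibration of a similarity metric.

    Shuffling the rows of Y destroys the stimulus correspondence while
    preserving each representation's internal geometry, so the null
    distribution captures similarity expected by chance at this scale.
    Returns (observed score, one-sided permutation p-value, z-score).
    """
    rng = np.random.default_rng(seed)
    observed = metric(X, Y)
    null = np.array(
        [metric(X, Y[rng.permutation(len(Y))]) for _ in range(n_perm)]
    )
    # Add-one correction keeps the p-value valid at finite n_perm.
    p_value = (1 + np.sum(null >= observed)) / (n_perm + 1)
    z_score = (observed - null.mean()) / (null.std() + 1e-12)
    return observed, p_value, z_score
```

A strongly related pair of representations (e.g., one a noisy linear transform of the other) should yield a small p-value and a large z-score, while unrelated representations should not, even when their raw metric scores are inflated by dimensionality.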