Evaluating representation similarity is fundamental to representation learning. However, existing metrics have significant limitations: they lack interpretability because their baselines shift, they are not robust to outliers, and they are computationally intractable for large datasets, which forces reliance on heuristic approximations. To address these limitations, we develop an ordinal-similarity framework, instantiated by the Triplet Similarity Index (TSI) and the Quadruplet Similarity Index (QSI), which measure alignment by quantifying the consistency of ordinal relationships. We show theoretically that this formulation is inherently interpretable, robust to outliers, and computationally efficient. We further establish a formal equivalence between TSI and local neighborhood alignment as measured by Mutual Nearest Neighbors. Empirically, we validate these properties and show that ordinal similarity offers a scalable approach to measuring alignment, enabling practitioners to better understand and design representations.
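The abstract does not give the formal definition of TSI or QSI, so the following is only a minimal sketch of what "consistency of ordinal relationships" could look like in general, not the paper's index. It assumes Euclidean distance, uniformly sampled triplets, and a hypothetical function name `triplet_agreement`: for each sampled triplet (i, j, k), it checks whether both representations agree on which of j and k is closer to i.

```python
import numpy as np

def triplet_agreement(X, Y, n_triplets=10000, seed=0):
    """Illustrative only (not the paper's TSI): fraction of sampled
    triplets (i, j, k) whose distance ordering agrees in X and Y.

    X, Y: arrays of shape (n_samples, dim_x) and (n_samples, dim_y),
    holding two representations of the same n_samples inputs.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Sample an anchor i and two comparison points j, k per triplet,
    # then drop degenerate triplets with repeated indices.
    i, j, k = rng.integers(0, n, size=(3, n_triplets))
    keep = (i != j) & (i != k) & (j != k)
    i, j, k = i[keep], j[keep], k[keep]

    def closer_to_j(Z):
        # True where j is strictly closer to anchor i than k is
        # (Euclidean distance is an assumption of this sketch).
        d_ij = np.linalg.norm(Z[i] - Z[j], axis=1)
        d_ik = np.linalg.norm(Z[i] - Z[k], axis=1)
        return d_ij < d_ik

    # Agreement rate over the retained triplets.
    return float(np.mean(closer_to_j(X) == closer_to_j(Y)))
```

Under these assumptions the score depends only on distance orderings, so comparing a representation with a uniformly rescaled copy of itself yields full agreement, while two unrelated representations should agree on roughly half of the triplets. Sampling a fixed number of triplets keeps the cost independent of the cubic number of possible triplets, which is one simple way such a measure could scale.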