While raw cosine similarity in pretrained embedding spaces exhibits strong rank correlation with human judgments, anisotropy induces systematic miscalibration of absolute values: scores concentrate in a narrow high-similarity band regardless of actual semantic relatedness, limiting interpretability as a quantitative measure. Prior work addresses this by modifying the embedding space (whitening, contrastive fine-tuning), but such transformations alter geometric structure and require recomputing all embeddings. Using isotonic regression trained on human similarity judgments, we construct a monotonic transformation that achieves near-perfect calibration while preserving rank correlation and local stability (98% across seven perturbation types). Our contribution is not to replace cosine similarity, but to restore interpretability of its absolute values through monotone calibration, without altering its ranking properties. We characterize isotonic calibration as an order-preserving reparameterization and prove that all order-based constructions (angular ordering, nearest neighbors, threshold graphs, and quantile-based decisions) are invariant under this transformation.
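The calibration idea can be illustrated with a minimal sketch. It is not the authors' implementation: it fits a non-decreasing step function with the pool-adjacent-violators algorithm in pure Python, the names `fit_isotonic` and `calibrate` are hypothetical, and the raw scores and human ratings are made-up toy values chosen only to mimic a narrow high-similarity band. It also assumes distinct raw scores and uses a step-function lookup rather than interpolation.

```python
from bisect import bisect_left
from itertools import combinations


def fit_isotonic(xs, ys):
    """Least-squares non-decreasing fit of ys on xs (pool-adjacent-violators).

    Returns (knots, values): values[i] applies to inputs up to knots[i].
    Assumes the xs are distinct.
    """
    blocks = []  # each block: [sum of targets, count, largest x in block]
    for x, y in sorted(zip(xs, ys)):
        blocks.append([y, 1, x])
        # Merge backwards while adjacent block means violate monotonicity.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, n, x_max = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += n
            blocks[-1][2] = x_max
    knots = [b[2] for b in blocks]
    values = [b[0] / b[1] for b in blocks]
    return knots, values


def calibrate(knots, values, x):
    """Map a raw score to its calibrated value; clamps outside the fit range."""
    return values[min(bisect_left(knots, x), len(values) - 1)]


# Toy data (illustrative only): raw cosine scores crowd into [0.82, 0.97],
# while the hypothetical human ratings span most of [0, 1].
raw = [0.82, 0.85, 0.88, 0.90, 0.93, 0.95, 0.97]
human = [0.05, 0.20, 0.15, 0.40, 0.55, 0.80, 0.95]

knots, values = fit_isotonic(raw, human)
calibrated = [calibrate(knots, values, x) for x in raw]

# Order check: no pair ranked one way by raw cosine is reversed after
# calibration (ties can appear where the fit pools adjacent points).
no_reversals = all(
    calibrated[i] <= calibrated[j]
    for i, j in combinations(range(len(raw)), 2)
    if raw[i] < raw[j]
)
print(calibrated)
print("order preserved:", no_reversals)
```

In this sketch the calibrated values spread across the rating scale while the ordering induced by raw cosine is never reversed. Because the fit is only weakly monotone, pooled points become ties rather than swaps.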