Generative model evaluation commonly relies on high-dimensional embedding spaces to compute distances between samples. We show that dataset representations in these spaces are affected by the hubness phenomenon, which distorts nearest-neighbor relationships and biases distance-based metrics. Building on the classical Iterative Contextual Dissimilarity Measure (ICDM), we introduce Generative ICDM (GICDM), a method to correct neighborhood estimation for both real and generated data. We further introduce a multi-scale extension that improves its empirical behavior. Extensive experiments on synthetic and real benchmarks demonstrate that GICDM resolves hubness-induced failures, restores reliable metric behavior, and improves alignment with human assessment.
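To make the hubness phenomenon concrete, the following is a minimal sketch of a standard diagnostic from the hubness literature: the skewness of the k-occurrence distribution, i.e. how often each point appears in the k-nearest-neighbor lists of the other points. It is not an implementation of ICDM or GICDM. The function names (`k_occurrence`, `skewness`), the i.i.d. Gaussian data, and the choices of sample size, `k`, and dimensions are all assumptions made purely for illustration.

```python
import numpy as np


def k_occurrence(X, k=5):
    """Count how often each point appears among the k nearest neighbors of the other points."""
    sq = np.sum(X ** 2, axis=1)
    # Squared Euclidean distances between all pairs of rows.
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)  # a point is never its own neighbor
    # Indices of the k smallest distances in each row (order within the k is irrelevant).
    nn = np.argpartition(d2, k, axis=1)[:, :k]
    return np.bincount(nn.ravel(), minlength=X.shape[0])


def skewness(counts):
    """Third standardized moment; larger positive values indicate stronger hubs."""
    c = counts.astype(float)
    centered = c - c.mean()
    return np.mean(centered ** 3) / np.std(c) ** 3


rng = np.random.default_rng(0)
n, k = 1000, 5  # illustrative values, not taken from the paper
results = {}
for dim in (3, 512):
    X = rng.standard_normal((n, dim))
    counts = k_occurrence(X, k)
    results[dim] = skewness(counts)
    print(f"dim={dim:4d}  skewness={results[dim]:.2f}  max k-occurrence={counts.max()}")
```

Under these assumptions, the high-dimensional sample should show a markedly more right-skewed k-occurrence distribution than the low-dimensional one: a few points become "hubs" that appear in many neighbor lists, while others appear in almost none. This is the kind of distortion that can bias nearest-neighbor-based evaluation metrics.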