Image representations (artificial or biological) are often compared in terms of their global geometry; however, representations with similar global structure can have strikingly different local geometries. Here, we propose a framework for comparing a set of image representations in terms of their local geometries. We quantify the local geometry of a representation using the Fisher information matrix, a standard statistical tool for characterizing the sensitivity to local stimulus distortions, and use this as a substrate for a metric on the local geometry in the vicinity of a base image. This metric may then be used to optimally differentiate a set of models, by finding a pair of "principal distortions" that maximize the variance of the models under this metric. We use this framework to compare a set of simple models of the early visual system, identifying a novel set of image distortions that allow immediate comparison of the models by visual inspection. In a second example, we apply our method to a set of deep neural network models and reveal differences in the local geometry that arise due to architecture and training types. These examples highlight how our framework can be used to probe for informative differences in local sensitivities between complex computational models, and suggest how it could be used to compare model representations with human perception.
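To make the role of the Fisher information matrix concrete, the following is a minimal sketch and not the authors' implementation. It assumes each model's response is a deterministic function of the image corrupted by additive, unit-variance Gaussian noise, in which case the Fisher information matrix reduces to J^T J, with J the Jacobian at the base image. The two toy models, their weight matrices `W_A` and `W_B`, and all function names are hypothetical and chosen only for illustration; the sketch compares sensitivities along fixed distortions and does not attempt the optimization over "principal distortions" described above.

```python
import numpy as np

def jacobian(f, x, eps=1e-6):
    """Central finite-difference Jacobian of f at x (rows: outputs, cols: inputs)."""
    x = np.asarray(x, dtype=float)
    cols = []
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        cols.append((f(x + step) - f(x - step)) / (2 * eps))
    return np.stack(cols, axis=1)

def fisher_information(f, x):
    """FIM under the ASSUMED noise model r = f(x) + N(0, I), i.e. J^T J."""
    J = jacobian(f, x)
    return J.T @ J

def sensitivity(fim, distortion):
    """Sensitivity to a unit-norm distortion e, computed as sqrt(e^T I e)."""
    e = distortion / np.linalg.norm(distortion)
    return float(np.sqrt(e @ fim @ e))

# Two hypothetical toy "models" acting on a 2-pixel image.
W_A = np.array([[3.0, 0.0], [0.0, 1.0]])
W_B = np.array([[1.0, 0.0], [0.0, 3.0]])

def model_a(x):
    return np.tanh(W_A @ x)

def model_b(x):
    return np.tanh(W_B @ x)

base_image = np.zeros(2)
fim_a = fisher_information(model_a, base_image)
fim_b = fisher_information(model_b, base_image)

# Compare how strongly each model responds to the same small distortions.
distortions = {"pixel 0": np.array([1.0, 0.0]), "pixel 1": np.array([0.0, 1.0])}
for name, e in distortions.items():
    print(name, sensitivity(fim_a, e), sensitivity(fim_b, e))
```

In this toy setting the two models have the same overall structure (a linear map followed by `tanh`) yet opposite local sensitivities at the base image: model A is most sensitive to changes in pixel 0 and model B to changes in pixel 1. Distortions along those two axes therefore separate the models cleanly, which is the kind of difference the framework is designed to expose.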