High-dimensional data commonly lies on low-dimensional submanifolds, and estimating the local intrinsic dimension (LID) of a datum, i.e. the dimension of the submanifold it belongs to, is a longstanding problem. LID can be understood as the number of local factors of variation: the more factors of variation a datum has, the more complex it tends to be. Estimating this quantity has proven useful in contexts ranging from generalization in neural networks to the detection of out-of-distribution data, adversarial examples, and AI-generated text. The recent successes of deep generative models present an opportunity to leverage them for LID estimation, but current methods based on generative models produce inaccurate estimates, require more than a single pre-trained model, are computationally intensive, or do not exploit the best available deep generative models, i.e. diffusion models (DMs). In this work, we show that the Fokker-Planck equation associated with a DM can provide a LID estimator which addresses all the aforementioned deficiencies. Our estimator, called FLIPD, is compatible with all popular DMs and outperforms existing baselines on LID estimation benchmarks. We also apply FLIPD to natural images, where the true LID is unknown. Compared to competing estimators, FLIPD exhibits a higher correlation with non-LID measures of complexity, better matches a qualitative assessment of complexity, and is the only estimator to remain tractable with high-resolution images at the scale of Stable Diffusion.
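To make the idea of reading LID off a noised density concrete, the following is a minimal toy sketch rather than the estimator as implemented in this work. It assumes a variance-exploding style of noising (data plus Gaussian noise of scale `sigma`) and uses the identity obtained from the heat equation, `D + sigma^2 * (tr(grad s) + ||s||^2)`, where `s` is the score of the noised density; this specific form is an assumption of the sketch and is not stated in the text above. The data model (a standard Gaussian supported on the first `d` axes of `R^D`) and the function names `noised_variances` and `toy_lid_estimate` are hypothetical and chosen so that the score is available in closed form, with no trained diffusion model involved.

```python
import numpy as np


def noised_variances(d, D, sigma):
    """Per-axis variance of the toy data after adding N(0, sigma^2 I_D) noise.

    Toy data model (an assumption of this sketch): N(0, I_d) on the first d
    axes of R^D and exactly zero on the remaining D - d axes.
    """
    var = np.full(D, sigma ** 2, dtype=float)
    var[:d] += 1.0
    return var


def toy_lid_estimate(x, d, sigma):
    """Evaluate D + sigma^2 * (tr(grad s) + ||s||^2) with the analytic score.

    The noised density is a zero-mean Gaussian with diagonal covariance, so
    its score is s(x) = -x / var and the Jacobian of s is -diag(1 / var).
    """
    D = x.shape[0]
    var = noised_variances(d, D, sigma)
    score = -x / var
    trace_jacobian = -np.sum(1.0 / var)
    return D + sigma ** 2 * (trace_jacobian + score @ score)


if __name__ == "__main__":
    d, D = 2, 10
    x = np.zeros(D)
    x[:d] = [1.0, -0.5]  # a point lying on the d-dimensional subspace
    for sigma in (1.0, 0.1, 0.001):
        print(sigma, toy_lid_estimate(x, d, sigma))
```

In this toy setting the estimate approaches `d` as `sigma` shrinks, because the `D - d` off-manifold directions each contribute exactly `-1` through the trace term. With a real diffusion model the score would come from the trained network and the trace would have to be computed or approximated numerically, which this sketch does not attempt.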