A significant gap exists between theory and practice in deep learning. Generalization and approximation error bounds are often derived for simplified models or are too loose to be informative. Many rely on the manifold hypothesis and on geometric regularity such as intrinsic dimension, curvature, and reach. Progress requires insight into data-manifold geometry and suitable benchmarks, yet existing options are polarized: analytic manifolds with known geometry but limited applicability, or real-world datasets where geometry is only coarsely estimable. We introduce a benchmarking framework for studying data geometry. We repurpose and extend dSprites and COIL-20 with additional transformation dimensions and dense, axis-aligned sampling, and pair them with finite-difference estimators that recover curvature, reach, and volume at near-ground-truth accuracy in a regime where general-purpose estimators are unreliable or difficult to deploy. The framework is intended as a controlled testbed, useful as a calibration environment for geometric estimators and a sandbox for probing theoretical assumptions. To illustrate its use, we present two application studies, namely assessing the scaling behavior of the bounds of Genovese et al. and Fefferman et al., and tracking the layer-wise geometry of a $\beta$-VAE, highlighting the behavior of current bounds and the value of controlled benchmarks for guiding and validating future theory. A reference implementation is available at https://github.com/koulakis/manifold-microscope.
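To make the finite-difference idea concrete, the following is a minimal sketch and not the reference implementation: it assumes a closed one-dimensional manifold (a curve) sampled densely on a uniform parameter grid, and estimates pointwise curvature and total length with central differences. The function names `curvature_fd` and `length_fd`, and the circle used as a test case, are illustrative assumptions introduced here.

```python
import numpy as np

def finite_differences(points, h):
    """Central first and second differences of a closed curve.

    points: (n, d) array where row i is gamma(i * h); the curve is assumed
    periodic, so indices wrap around (hypothetical setup for illustration).
    """
    nxt = np.roll(points, -1, axis=0)   # gamma(t + h)
    prv = np.roll(points, 1, axis=0)    # gamma(t - h)
    d1 = (nxt - prv) / (2.0 * h)
    d2 = (nxt - 2.0 * points + prv) / h**2
    return d1, d2

def curvature_fd(points, h):
    """Pointwise curvature: norm of the normal part of gamma'' over |gamma'|^2."""
    d1, d2 = finite_differences(points, h)
    speed2 = np.sum(d1 * d1, axis=1)
    # Remove the tangential component of the second derivative.
    proj = np.sum(d1 * d2, axis=1) / speed2
    normal_part = d2 - proj[:, None] * d1
    return np.linalg.norm(normal_part, axis=1) / speed2

def length_fd(points, h):
    """One-dimensional volume (arc length) as a Riemann sum of the speed."""
    d1, _ = finite_differences(points, h)
    return float(np.sum(np.linalg.norm(d1, axis=1)) * h)

# Analytic test case: a circle of radius r embedded in R^3.
r, n = 2.0, 1000
h = 2.0 * np.pi / n
t = np.arange(n) * h
circle = np.stack([r * np.cos(t), r * np.sin(t), np.zeros(n)], axis=1)

kappa = curvature_fd(circle, h)
length = length_fd(circle, h)
```

For a circle of radius $r$ the exact curvature is $1/r$ and the exact length is $2\pi r$, so the estimates can be checked against known values as the grid is refined. This mirrors, in a toy setting, the role of dense axis-aligned sampling: derivatives along a known parameter axis can be approximated directly rather than inferred from an unstructured point cloud.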