Bounding and predicting the generalization gap of overparameterized neural networks remains a central open problem in theoretical machine learning. A recent and growing body of literature proposes fractal frameworks to model the optimization trajectories of neural networks, motivating generalization bounds and measures based on the fractal dimension of the trajectory. Notably, the persistent homology dimension has been proposed to correlate with the generalization gap. This paper presents an empirical evaluation of these persistent homology-based generalization measures, together with an in-depth statistical analysis. Our study reveals confounding effects in the observed correlation between generalization and topological measures, arising from the variation of hyperparameters. We also observe that fractal dimension fails to predict the generalization of models trained from poor initializations. Lastly, we reveal the intriguing manifestation of model-wise double descent in these topological generalization measures. Our work forms a basis for a deeper investigation of the causal relationships between fractal geometry, topological data analysis, and neural network optimization.
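For concreteness, the sketch below illustrates how the persistent homology dimension referenced above is typically estimated in this literature: the α-weighted sum of 0-dimensional persistence lifetimes of a finite point cloud coincides with the α-weighted edge lengths of its Euclidean minimum spanning tree, and the dimension is recovered from the slope of this quantity against sample size on a log-log scale. This is a minimal illustrative sketch, not the paper's implementation; the function names `ph_dimension` and `alpha_weighted_lifetime_sum`, and all parameter defaults, are our own assumptions.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.sparse.csgraph import minimum_spanning_tree


def alpha_weighted_lifetime_sum(points, alpha=1.0):
    """E^0_alpha: sum of 0-dim persistence lifetimes raised to alpha.

    For a finite point cloud this equals the sum of the Euclidean
    minimum-spanning-tree edge lengths raised to alpha.
    (Note: exact duplicate points yield zero distances, which the
    dense csgraph input treats as absent edges.)
    """
    dists = squareform(pdist(points))          # pairwise Euclidean distances
    mst = minimum_spanning_tree(dists)         # sparse matrix of MST edges
    return np.power(mst.data, alpha).sum()


def ph_dimension(trajectory, alpha=1.0, sample_sizes=None, seed=0):
    """Estimate the 0-dim PH dimension of a point cloud, e.g. a stack
    of flattened SGD iterates of shape (num_iterates, num_params).

    Fits log E^0_alpha(W_n) ~ beta * log n and returns
    dim_PH = alpha / (1 - beta).
    """
    rng = np.random.default_rng(seed)
    n_total = len(trajectory)
    if sample_sizes is None:
        # Illustrative default: eight subsample sizes up to the full cloud.
        sample_sizes = np.linspace(n_total // 4, n_total, 8, dtype=int)
    log_n, log_e = [], []
    for n in sample_sizes:
        idx = rng.choice(n_total, size=n, replace=False)
        log_n.append(np.log(n))
        log_e.append(np.log(alpha_weighted_lifetime_sum(trajectory[idx], alpha)))
    slope, _ = np.polyfit(log_n, log_e, 1)
    return alpha / (1.0 - slope)


# Toy usage: a random-walk "trajectory" in a 10-dimensional parameter space.
trajectory = np.cumsum(np.random.default_rng(1).normal(size=(1000, 10)), axis=0)
print(ph_dimension(trajectory, alpha=1.0))
```

In practice the point cloud would be the recorded iterates of the training run rather than a synthetic random walk, and robustness checks (multiple subsample draws, different α) are usual before reading the fitted dimension as a generalization measure.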