This paper investigates the connection between the learning trajectories of deep neural networks (DNNs) and their generalization capabilities when they are optimized with the widely used gradient descent and stochastic gradient descent algorithms. We construct a Linear Approximation Function to model the trajectory information and, based on it, propose a new generalization bound that incorporates richer trajectory information. The proposed bound depends on the complexity of the learning trajectory and on the ratio between the bias and the diversity of the training set. Experimental results indicate that the proposed method effectively captures the generalization trend across various training steps, learning rates, and label noise levels.