Neural ordinary differential equations (neural ODEs) are a popular family of continuous-depth deep learning models. In this work, we consider a large family of parameterized ODEs with continuous-in-time parameters, which includes time-dependent neural ODEs. We derive a generalization bound for this class using a Lipschitz-based argument. By leveraging the analogy between neural ODEs and deep residual networks, our approach yields, in particular, a generalization bound for a class of deep residual networks. The bound involves the magnitude of the difference between successive weight matrices. We illustrate numerically how this quantity affects the generalization capability of neural networks.
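To make the quantity in the bound concrete, the following is a minimal sketch of how the differences between successive weight matrices of a residual network could be measured. Everything specific here is an assumption for illustration and is not taken from the paper: the toy update rule h_{k+1} = h_k + (1/L) tanh(W_k h_k), the choice of the Frobenius norm, and the hypothetical helper names `residual_forward` and `successive_weight_differences`. The exact norm and scaling that appear in the bound may differ.

```python
import numpy as np


def residual_forward(x, weights):
    """Toy residual network: h_{k+1} = h_k + (1/L) * tanh(W_k h_k).

    The 1/L scaling mimics an explicit Euler step of an ODE on [0, 1]
    (an illustrative assumption, not the paper's exact architecture).
    """
    depth = len(weights)
    h = x
    for W in weights:
        h = h + np.tanh(W @ h) / depth
    return h


def successive_weight_differences(weights):
    """Frobenius norms ||W_{k+1} - W_k||_F for consecutive layers."""
    return np.array(
        [np.linalg.norm(W_next - W) for W, W_next in zip(weights[:-1], weights[1:])]
    )


rng = np.random.default_rng(0)
depth, dim = 8, 4

# "Smooth" weights: linear interpolation between two matrices A and B,
# so consecutive layers differ by the same small increment.
A = rng.standard_normal((dim, dim))
B = rng.standard_normal((dim, dim))
smooth_weights = [A + (k / (depth - 1)) * (B - A) for k in range(depth)]

# "Rough" weights: independent random matrices with no relation between layers.
rough_weights = [rng.standard_normal((dim, dim)) for _ in range(depth)]

smooth_diffs = successive_weight_differences(smooth_weights)
rough_diffs = successive_weight_differences(rough_weights)

x = rng.standard_normal(dim)
output = residual_forward(x, smooth_weights)

print("sum of successive differences (smooth):", smooth_diffs.sum())
print("sum of successive differences (rough): ", rough_diffs.sum())
```

For the interpolated weights, the differences sum to exactly the Frobenius norm of B - A regardless of depth, whereas for unrelated layers the sum grows with the number of layers. This is the kind of contrast the abstract's final sentence refers to, shown here only on synthetic weights.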