We consider a supervised learning setup in which the goal is to predict an outcome from a sample of irregularly sampled time series using Neural Controlled Differential Equations (Kidger, Morrill, et al. 2020). In our framework, the time series is a discretization of an unobserved continuous path, and the outcome depends on this path through a controlled differential equation with an unknown vector field. Learning with discrete data thus induces a discretization bias, which we precisely quantify. Using theoretical results on the continuity of the flow of controlled differential equations, we show that the approximation bias is directly related to the error incurred when approximating the Lipschitz function that defines the generative model by a shallow neural network. By combining these results with recent work linking the Lipschitz constant of neural networks to their generalization capacities, we upper bound the generalization gap between the expected loss attained by the empirical risk minimizer and the expected loss of the true predictor.