We consider a supervised learning setup in which the goal is to predict an outcome from a sample of irregularly sampled time series using Neural Controlled Differential Equations (Kidger, Morrill, et al. 2020). In our framework, each time series is a discretization of an unobserved continuous path, and the outcome depends on this path through a controlled differential equation with an unknown vector field. Learning from discrete data thus induces a discretization bias, which we precisely quantify. Using theoretical results on the continuity of the flow of controlled differential equations, we show that the approximation bias is directly related to the error made when a shallow neural network approximates the Lipschitz function that defines the generative model. By combining these results with recent work linking the Lipschitz constant of neural networks to their generalization capacity, we upper bound the generalization gap between the expected loss attained by the empirical risk minimizer and the expected loss of the true predictor.
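To make the setup concrete, here is a minimal sketch of how a controlled differential equation of the form dz = G(z) dx can be discretized on irregularly sampled observations with an Euler scheme. Everything in it is an illustrative assumption rather than the method of the paper: the function names `vector_field`, `euler_cde` and `path`, the single-layer tanh field, the dimensions, the random weights, and the sine-based driving path are all hypothetical. The script compares the terminal state computed from a few irregular observations with the one computed on a fine grid, which gives an informal picture of the discretization effect the abstract refers to; it does not reproduce any bound.

```python
import math
import random


def vector_field(z, A, b, d):
    """Hypothetical shallow field G(z) = tanh(A z + b), reshaped to a p x d matrix."""
    p = len(z)
    flat = [
        math.tanh(sum(A[r][k] * z[k] for k in range(p)) + b[r])
        for r in range(p * d)
    ]
    return [flat[i * d:(i + 1) * d] for i in range(p)]


def euler_cde(xs, z0, A, b):
    """Euler scheme z_{k+1} = z_k + G(z_k) (x_{k+1} - x_k) over the observed points xs."""
    d = len(xs[0])
    z = list(z0)
    for x_prev, x_next in zip(xs, xs[1:]):
        dx = [x_next[j] - x_prev[j] for j in range(d)]
        G = vector_field(z, A, b, d)
        z = [z[i] + sum(G[i][j] * dx[j] for j in range(d)) for i in range(len(z))]
    return z


def path(t):
    """Assumed continuous driving path on [0, 1]: a time channel plus one signal channel."""
    return [t, math.sin(2 * math.pi * t)]


rng = random.Random(0)
p, d = 3, 2  # hidden-state and path dimensions (arbitrary choices)
A = [[rng.gauss(0, 0.5) for _ in range(p)] for _ in range(p * d)]
b = [rng.gauss(0, 0.5) for _ in range(p * d)]
z0 = [0.0] * p

# Ten irregular observation times in [0, 1], versus a fine regular grid
# that stands in for the unobserved continuous path.
coarse_times = sorted([0.0, 1.0] + [rng.random() for _ in range(8)])
fine_times = [k / 2000 for k in range(2001)]

z_coarse = euler_cde([path(t) for t in coarse_times], z0, A, b)
z_fine = euler_cde([path(t) for t in fine_times], z0, A, b)

# Sup-norm gap between the two terminal states.
gap = max(abs(u - v) for u, v in zip(z_coarse, z_fine))
print("terminal-state gap between coarse and fine sampling:", gap)
```

In this sketch the path carries time as its first channel, so irregular spacing enters through the increments of `x` rather than through a separate step size.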