We address the problem of learning the dynamics of an unknown non-parametric system linking a target time series to a feature time series. The feature time series is measured on a sparse and irregular grid, while only a few points of the target time series are observed. Once learned, these dynamics can be used to predict values of the target from past values of the feature time series. We frame this task as learning the solution map of a controlled differential equation (CDE). By leveraging the rich theory of signatures, we cast this non-linear problem as a high-dimensional linear regression. We provide an oracle bound on the prediction error that exhibits explicit dependencies on the individual-specific sampling schemes. Our theoretical results are illustrated by simulations, which show that our method outperforms existing algorithms for recovering the full time series while being computationally cheap. We conclude by demonstrating its potential on real-world epidemiological data.
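To make the reduction from a CDE solution map to a linear regression more concrete, the following is a minimal sketch and not the estimator studied in the paper. It assumes a truncation depth of 2, time-augmented piecewise-linear interpolation of each irregularly sampled feature path, and a plain ridge penalty; the function names `signature_depth2`, `random_path`, and `fit_ridge`, as well as the synthetic data, are hypothetical and chosen only for illustration.

```python
import numpy as np


def signature_depth2(path):
    """Depth-2 signature of the piecewise-linear path through the rows of `path`.

    Returns the flat vector (1, level-1 terms, level-2 terms).
    """
    path = np.asarray(path, dtype=float)
    d = path.shape[1]
    level1 = np.zeros(d)
    level2 = np.zeros((d, d))
    for delta in np.diff(path, axis=0):
        # Chen's identity for appending one linear segment with increment `delta`.
        level2 += np.outer(level1, delta) + 0.5 * np.outer(delta, delta)
        level1 += delta
    return np.concatenate(([1.0], level1, level2.ravel()))


def random_path(rng):
    """Hypothetical feature series observed on a sparse, irregular time grid."""
    n_obs = rng.integers(5, 15)
    times = np.sort(rng.uniform(0.0, 1.0, size=n_obs))
    values = np.cumsum(rng.normal(size=n_obs))
    # Time augmentation: the first channel is the observation time itself.
    return np.column_stack([times, values])


def fit_ridge(features, targets, penalty=1e-3):
    """Ridge regression in closed form (the penalty choice is an assumption)."""
    p = features.shape[1]
    gram = features.T @ features + penalty * np.eye(p)
    return np.linalg.solve(gram, features.T @ targets)


# Synthetic illustration: one target value per individual, generated as a
# linear functional of the truncated signature of that individual's path.
rng = np.random.default_rng(0)
features = np.stack([signature_depth2(random_path(rng)) for _ in range(40)])
theta_true = rng.normal(size=features.shape[1])
targets = features @ theta_true

theta_hat = fit_ridge(features, targets, penalty=1e-6)
predictions = features @ theta_hat
```

Each individual may have a different number of observations and a different time grid, yet every path is mapped to a feature vector of the same length, which is what allows a single linear model to be fitted across individuals. In practice a deeper truncation would likely be needed, and the number of signature terms grows geometrically with depth, hence the high-dimensional regime mentioned above.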