We address the problem of learning the dynamics of an unknown non-parametric system linking a target time series and a feature time series. The feature time series is measured on a sparse and irregular grid, while only a few points of the target time series are observed. Once learned, these dynamics can be used to predict values of the target from the previous values of the feature time series. We frame this task as learning the solution map of a controlled differential equation (CDE). By leveraging the rich theory of signatures, we cast this non-linear problem as a high-dimensional linear regression. We provide an oracle bound on the prediction error that exhibits explicit dependencies on the individual-specific sampling schemes. Our theoretical results are illustrated by simulations, which show that our method outperforms existing algorithms for recovering the full time series while being computationally cheap. We conclude by demonstrating its potential on real-world epidemiological data.
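To make the signature linearization concrete, the following is a minimal sketch and not the paper's implementation. It assumes a time-augmented feature path with linear interpolation between irregular observations, a signature truncated at depth 2, and ridge regression as the linear estimator; the function names `signature_depth2` and `fit_ridge`, the truncation depth, and the penalty value are all illustrative choices.

```python
import numpy as np


def signature_depth2(path):
    """Depth-2 signature of the piecewise-linear path through the rows of `path`.

    `path` has shape (n_points, d). Returns a vector of length 1 + d + d*d:
    the constant term, the level-1 terms, and the flattened level-2 terms.
    """
    path = np.asarray(path, dtype=float)
    d = path.shape[1]
    level1 = np.zeros(d)
    level2 = np.zeros((d, d))
    for delta in np.diff(path, axis=0):
        # Chen's identity: concatenate the running path with one linear segment,
        # whose own signature is (delta, delta (x) delta / 2) up to depth 2.
        level2 += np.outer(level1, delta) + 0.5 * np.outer(delta, delta)
        level1 += delta
    return np.concatenate(([1.0], level1, level2.ravel()))


def fit_ridge(features, targets, penalty=1e-6):
    """Ridge regression coefficients (assumed estimator, chosen for illustration)."""
    features = np.asarray(features, dtype=float)
    gram = features.T @ features + penalty * np.eye(features.shape[1])
    return np.linalg.solve(gram, features.T @ np.asarray(targets, dtype=float))


if __name__ == "__main__":
    # Toy data: each individual has its own irregular sampling grid.
    rng = np.random.default_rng(0)
    feature_rows, targets = [], []
    for _ in range(100):
        n_points = rng.integers(3, 8)
        times = np.sort(rng.uniform(0.0, 1.0, n_points))
        values = rng.normal(size=n_points)
        sig = signature_depth2(np.column_stack([times, values]))
        feature_rows.append(sig)
        # Toy target: the integral of (x - x_0) dt, which is exactly the
        # level-2 term indexed (x, t), plus a little observation noise.
        targets.append(sig[5] + 0.01 * rng.normal())
    coef = fit_ridge(np.array(feature_rows), np.array(targets))
    print(coef)
```

Because the signature of a piecewise-linear path does not change when extra points are added along a straight segment, the same feature map applies to individuals observed on different grids, which is why the regression stays linear despite irregular sampling.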