In real-world domains such as traffic and energy, large-scale time series with missing values and noise are common, and they are often sampled irregularly. Although many imputation methods have been proposed, most share three limitations. First, they work with a local horizon: models are trained by splitting a long sequence into batches of suitably sized patches, which can cause them to ignore global trends and periodic patterns. Second, and more importantly, almost all methods assume that observations are sampled at regular time stamps, so they cannot handle the complex irregularly sampled time series that arise in different applications. Third, most existing methods are trained offline, which makes them unsuitable for applications with fast-arriving streaming data. To overcome these limitations, we propose BayOTIDE: Bayesian Online Multivariate Time series Imputation with functional decomposition. We treat the multivariate time series as a weighted combination of groups of low-rank temporal factors with different patterns. We use a group of Gaussian processes (GPs) with different kernels as functional priors to fit the factors. For computational efficiency, we convert the GPs into state-space priors by constructing equivalent stochastic differential equations (SDEs), and develop a scalable algorithm for online inference. The proposed method not only handles imputation at arbitrary time stamps, but also offers uncertainty quantification and interpretability for downstream applications. We evaluate our method on both synthetic and real-world datasets. We release the code at https://github.com/xuangu-fang/BayOTIDE
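The GP-to-state-space conversion mentioned above can be illustrated with a minimal sketch. It is not taken from the BayOTIDE code: the choice of a Matérn-3/2 kernel, the function names `matern32_kernel` and `matern32_state_space`, and the hyperparameter values are all assumptions made for illustration. For this kernel, the equivalent linear SDE has a two-dimensional state (the function value and its derivative), and the transition matrix over a time gap of any length has a closed form, which is what allows irregular time stamps to be handled in a single sequential pass.

```python
import numpy as np


def matern32_kernel(tau, lengthscale=1.0, variance=1.0):
    """Matern-3/2 covariance k(tau) = s^2 (1 + lam|tau|) exp(-lam|tau|)."""
    lam = np.sqrt(3.0) / lengthscale
    r = np.abs(tau)
    return variance * (1.0 + lam * r) * np.exp(-lam * r)


def matern32_state_space(dt, lengthscale=1.0, variance=1.0):
    """Discrete-time transition for the SDE equivalent to a Matern-3/2 GP.

    State is x = (f, df/dt) with drift F = [[0, 1], [-lam^2, -2*lam]].
    Returns the transition matrix A = exp(F*dt), the process-noise
    covariance Q, and the stationary state covariance P_inf.
    (Hypothetical helper, written for this illustration only.)
    """
    lam = np.sqrt(3.0) / lengthscale
    # Stationary covariance of the state (f, df/dt).
    P_inf = np.diag([variance, lam**2 * variance])
    # Closed form of the matrix exponential exp(F*dt) for this F.
    A = np.exp(-lam * dt) * np.array(
        [[1.0 + lam * dt, dt],
         [-(lam**2) * dt, 1.0 - lam * dt]]
    )
    # Process noise that keeps the state covariance stationary.
    Q = P_inf - A @ P_inf @ A.T
    return A, Q, P_inf


# Irregular gaps between consecutive time stamps (assumed values).
gaps = [0.05, 0.3, 1.7, 4.0]
for dt in gaps:
    A, Q, P_inf = matern32_state_space(dt, lengthscale=0.8, variance=2.0)
    # Covariance between f(t) and f(t + dt) implied by the state-space model.
    cov_state_space = (A @ P_inf)[0, 0]
    cov_gp = matern32_kernel(dt, lengthscale=0.8, variance=2.0)
    print(f"dt={dt:4.2f}  state-space={cov_state_space:.6f}  kernel={cov_gp:.6f}")
```

The printed pairs should agree for every gap, showing that the state-space model reproduces the GP covariance at arbitrary spacing. A Kalman-style filter built on `A` and `Q` would then cost time linear in the number of observations, in contrast to the cubic cost of naive GP inference.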