While current generative models have achieved promising performance in time-series synthesis, they either make strong assumptions about the data format (e.g., regular sampling) or rely on pre-processing (e.g., interpolation) to simplify the raw data. In this work, we consider a class of time series with three common and troublesome properties: sampling irregularities, missingness, and large feature-temporal dimensions. We introduce a general model, TS-Diffusion, to process such complex time series. Our model consists of three parts, built within a point-process framework. The first part is a neural ordinary differential equation (ODE) encoder that converts time series into dense representations, using a jump technique to capture sampling irregularities and a self-attention mechanism to handle missing values. The second part is a diffusion model that learns from these time-series representations, which can follow a complex distribution because of their high dimensionality. The third part is a second ODE-based decoder that, given a representation, generates time series with irregularities and missing values. We have conducted extensive experiments on multiple time-series datasets, demonstrating that TS-Diffusion achieves excellent results on both conventional and complex time series and significantly outperforms previous baselines.
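To make the setting concrete, the following is a minimal sketch of the kind of data the abstract describes, together with a noising step applied to a dense representation. It is not the TS-Diffusion implementation: the helper names (`make_irregular_series`, `alpha_bar`, `forward_diffuse`), the use of `None` for missing entries, the linear noise schedule, and the 16-dimensional stand-in for the encoder output are all assumptions made for illustration. The noising step uses the standard closed-form forward process of denoising diffusion models, which may differ from the formulation used in the paper.

```python
import math
import random


def make_irregular_series(rng, n_obs=8, n_features=3, missing_rate=0.3):
    """Toy series with uneven sampling times and missing entries (hypothetical format)."""
    # Observation times are drawn at random and sorted, so gaps between them are uneven.
    times = sorted(rng.uniform(0.0, 10.0) for _ in range(n_obs))
    values, mask = [], []
    for _ in range(n_obs):
        observed = [rng.random() > missing_rate for _ in range(n_features)]
        # Missing entries are stored as None; the mask records which entries are observed.
        row = [rng.gauss(0.0, 1.0) if obs else None for obs in observed]
        mask.append(observed)
        values.append(row)
    return times, values, mask


def alpha_bar(betas, t):
    """Cumulative product of (1 - beta_s) for s = 0..t."""
    prod = 1.0
    for beta in betas[: t + 1]:
        prod *= 1.0 - beta
    return prod


def forward_diffuse(z0, t, betas, rng):
    """Sample z_t = sqrt(a_bar) * z0 + sqrt(1 - a_bar) * noise (standard closed form)."""
    a_bar = alpha_bar(betas, t)
    noise = [rng.gauss(0.0, 1.0) for _ in z0]
    zt = [math.sqrt(a_bar) * z + math.sqrt(1.0 - a_bar) * e for z, e in zip(z0, noise)]
    return zt, noise


rng = random.Random(0)
times, values, mask = make_irregular_series(rng)

# Stand-in for an encoder output; a real encoder would compute this from the series.
z0 = [rng.gauss(0.0, 1.0) for _ in range(16)]

# Assumed linear noise schedule with 1000 steps.
n_steps = 1000
betas = [1e-4 + (0.02 - 1e-4) * i / (n_steps - 1) for i in range(n_steps)]
zt, eps = forward_diffuse(z0, n_steps - 1, betas, rng)
```

At the last step of this assumed schedule the cumulative product is close to zero, so `zt` is nearly pure noise. A learned reverse process would start from such noise to produce a new representation, which a decoder would then turn back into timestamps, values, and a missingness mask.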