Recently, there has been a surge of interest in generative modeling of time series data. Most existing approaches are designed to process either short sequences or long-range sequences. This dichotomy can be attributed to the gradient issues of recurrent networks, the computational costs of transformers, and the limited expressiveness of state space models. Toward a unified generative model for varying-length time series, we propose in this work to transform sequences into images. By employing invertible transforms such as the delay embedding and the short-time Fourier transform, we unlock three main advantages: i) We can exploit advanced diffusion vision models; ii) We can process both short- and long-range inputs within the same framework; and iii) We can harness recent and established tools from the time-series-to-image literature. We validate the effectiveness of our method through a comprehensive evaluation across multiple tasks, including unconditional generation, interpolation, and extrapolation, and we show that our approach consistently achieves state-of-the-art results against strong baselines. On unconditional generation, we obtain remarkable mean improvements of 58.17% over previous diffusion models in the short discriminative score and 132.61% in the (ultra-)long classification scores. Code is at https://github.com/azencot-group/ImagenTime.
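To make the sequence-to-image idea concrete, the following is a minimal sketch of an invertible delay embedding, assuming the simplest variant in which windows do not overlap (the skip equals the window length), so the transform reduces to a lossless reshape; the function names and the non-overlapping choice are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def delay_embed(x: np.ndarray, rows: int) -> np.ndarray:
    """Map a 1-D series of length N to a (rows, N // rows) 'image'.

    Non-overlapping delay embedding: column j holds samples
    x[j*rows : (j+1)*rows]. Any trailing samples that do not fill a
    complete column are dropped (a simplifying assumption).
    """
    n = (len(x) // rows) * rows
    return x[:n].reshape(-1, rows).T  # columns are consecutive windows

def inverse_delay_embed(img: np.ndarray) -> np.ndarray:
    """Exactly invert delay_embed, recovering the original series."""
    return img.T.reshape(-1)
```

Because the forward map is a pure reshape, inversion is exact, which is what lets a diffusion model trained on the image representation be decoded back into a valid time series. The short-time Fourier transform offers an alternative invertible mapping (e.g., via `scipy.signal.stft` / `istft`) when a frequency-domain image is preferable.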