Generating high-quality, long-sequence time-series data is essential because of its wide range of applications. Previously, Generative Adversarial Networks (GANs) built on standalone recurrent or convolutional neural networks were used to synthesize time-series data. However, architectural limitations make them inadequate for generating long sequences. Furthermore, GANs are well known for training instability and mode collapse. To address these problems, we propose TransFusion, a diffusion- and transformer-based generative model for high-quality, long-sequence time-series data. We extend the sequence length to 384 and generate high-quality synthetic data. To the best of our knowledge, this is the first study to work with sequences of this length. We also introduce two evaluation metrics that assess the quality of the synthetic data and its predictive characteristics. We evaluate TransFusion with a wide variety of visual and empirical metrics, and it outperforms the previous state of the art by a significant margin.