Generative adversarial networks (GANs) have been extremely successful in generating samples from seemingly high-dimensional probability measures. However, these methods struggle to capture the temporal dependence of joint probability distributions induced by time-series data. Furthermore, long time-series data streams hugely increase the dimension of the target space, which may render generative modelling infeasible. To overcome these challenges, motivated by the autoregressive models in econometrics, we are interested in the conditional distribution of future time series given the past information. We propose the generic conditional Sig-WGAN framework by integrating Wasserstein-GANs (WGANs) with a mathematically principled and efficient path feature extraction method called the signature of a path. The signature of a path is a graded sequence of statistics that provides a universal description for a stream of data, and its expected value characterises the law of the time-series model. In particular, we develop the conditional Sig-$W_1$ metric, which captures the conditional joint law of time-series models, and use it as a discriminator. The signature feature space enables an explicit representation of the proposed discriminators, which alleviates the need for expensive training. We validate our method on both synthetic and empirical datasets and observe that it consistently and significantly outperforms state-of-the-art benchmarks with respect to measures of similarity and predictive ability.
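To make the idea of comparing laws through expected signatures concrete, the following is a minimal sketch in Python with NumPy, not the implementation used in this work. It assumes piecewise-linear paths, truncates the signature at depth 2, and measures the distance between two sets of paths as the Euclidean norm of the difference of their empirical expected signatures. This is an unconditional simplification: the conditioning on past information described above is not modelled here. The function names `signature_depth2`, `expected_signature` and `sig_distance` are hypothetical and chosen for illustration only.

```python
import numpy as np


def signature_depth2(path):
    """Depth-2 truncated signature of a piecewise-linear path of shape (T, d).

    Returns a vector of length d + d*d: the level-1 terms (total increment)
    followed by the flattened level-2 terms (iterated integrals).
    """
    increments = np.diff(path, axis=0)  # shape (T-1, d)
    level1 = increments.sum(axis=0)
    # Displacement from the starting point at the beginning of each segment.
    before = np.cumsum(increments, axis=0) - increments
    # Chen's identity for linear segments: cross terms between earlier
    # displacement and the current increment, plus the within-segment term.
    level2 = before.T @ increments + 0.5 * (increments.T @ increments)
    return np.concatenate([level1, level2.ravel()])


def expected_signature(paths):
    """Empirical mean of depth-2 signatures over a collection of paths."""
    return np.mean([signature_depth2(p) for p in paths], axis=0)


def sig_distance(paths_a, paths_b):
    """Euclidean distance between the empirical expected signatures."""
    return np.linalg.norm(expected_signature(paths_a) - expected_signature(paths_b))


# Illustrative usage on simulated random walks (assumed toy data).
rng = np.random.default_rng(0)
walks_a = np.cumsum(rng.normal(0.0, 1.0, size=(200, 20, 2)), axis=1)
walks_b = np.cumsum(rng.normal(0.5, 1.0, size=(200, 20, 2)), axis=1)
distance_same = sig_distance(walks_a, walks_a)
distance_diff = sig_distance(walks_a, walks_b)
```

Because the distance is an explicit function of sample averages, no discriminator network has to be trained to evaluate it, which is the property the signature feature space is used for above. In practice a higher truncation depth and a dedicated signature library would likely be preferable to this hand-rolled depth-2 version.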