Random noise arising from physical processes is an inherent characteristic of measurements and a limiting factor for most signal processing and data analysis tasks. Given the recent interest in generative adversarial networks (GANs) for data-driven modeling, it is important to determine to what extent GANs can faithfully reproduce noise in target data sets. In this paper, we present an empirical investigation that aims to shed light on this issue for time series. Specifically, we assess two general-purpose time-series GANs based on the popular deep convolutional GAN (DCGAN) architecture: a direct time-series model and an image-based model that uses a short-time Fourier transform (STFT) data representation. The GAN models are trained and quantitatively evaluated using distributions of simulated noise time series with known ground-truth parameters. Target time series distributions include a broad range of noise types commonly encountered in physical measurements, electronics, and communication systems: band-limited thermal noise, power-law noise, shot noise, and impulsive noise. We find that GANs are capable of learning many noise types, although they predictably struggle when the GAN architecture is not well suited to some aspects of the noise, e.g., impulsive time series with extreme outliers. Our findings provide insights into the capabilities and potential limitations of current approaches to time-series GANs and highlight areas for further research. In addition, our battery of tests provides a useful benchmark to aid the development of deep generative models for time series.
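To make the idea of evaluating against "known ground-truth parameters" concrete, the following is a minimal sketch for one of the noise types listed above, power-law noise. It is not the authors' simulation or evaluation code: the spectral-shaping generator, the log-log periodogram fit, the function names `power_law_noise` and `estimate_alpha`, and the values (4096 samples, 64 series, exponent 1.0) are all assumptions chosen for illustration.

```python
import numpy as np

def power_law_noise(n_samples, alpha, rng):
    """Gaussian noise whose PSD is proportional to 1/f**alpha.

    Illustrative spectral-shaping approach (an assumption, not the
    paper's generator): shape the spectrum of white noise and invert.
    """
    white = rng.standard_normal(n_samples)
    spectrum = np.fft.rfft(white)
    freqs = np.fft.rfftfreq(n_samples)
    scale = np.zeros_like(freqs)  # scale[0] stays 0, removing the DC term
    scale[1:] = freqs[1:] ** (-alpha / 2.0)  # amplitude scales as f^(-alpha/2)
    shaped = np.fft.irfft(spectrum * scale, n=n_samples)
    return shaped / shaped.std()  # normalize each series to unit variance

def estimate_alpha(series_batch):
    """Estimate the PSD exponent from a batch of series.

    Averages the periodogram over the batch, then fits a straight
    line to log(PSD) versus log(frequency); the exponent is -slope.
    """
    n = series_batch.shape[1]
    psd = np.mean(np.abs(np.fft.rfft(series_batch, axis=1)) ** 2, axis=0)
    freqs = np.fft.rfftfreq(n)
    # Skip the DC bin, where the frequency is zero.
    slope, _ = np.polyfit(np.log(freqs[1:]), np.log(psd[1:]), 1)
    return -slope

rng = np.random.default_rng(0)
true_alpha = 1.0  # hypothetical ground-truth parameter
batch = np.stack([power_law_noise(4096, true_alpha, rng) for _ in range(64)])
alpha_hat = estimate_alpha(batch)
print(f"true alpha = {true_alpha}, estimated alpha = {alpha_hat:.3f}")
```

In a benchmark of this kind, the same estimator could be applied to a batch of GAN-generated series, and the recovered exponent compared with the value used to simulate the training data.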