Fourier analysis has been an instrumental tool in the development of signal processing. This leads us to wonder whether this framework could similarly benefit generative modelling. In this paper, we explore this question through the lens of time series diffusion models. More specifically, we analyze whether representing time series in the frequency domain is a useful inductive bias for score-based diffusion models. Starting from the canonical SDE formulation of diffusion in the time domain, we show that a dual diffusion process occurs in the frequency domain, with an important nuance: Brownian motions are replaced by what we call mirrored Brownian motions, characterized by mirror symmetries among their components. Building on this insight, we show how to adapt the denoising score matching approach to implement diffusion models in the frequency domain. This results in frequency diffusion models, which we compare to canonical time diffusion models. Our empirical evaluation on real-world datasets, covering domains such as healthcare and finance, shows that frequency diffusion models capture the training distribution better than time diffusion models. We explain this observation by showing that time series from these datasets tend to be more localized in the frequency domain than in the time domain, which makes them easier to model in the frequency domain. All our observations point towards impactful synergies between Fourier analysis and diffusion models.
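The mirror symmetry mentioned in the abstract can be illustrated numerically. The following is a minimal sketch, not the paper's implementation: it assumes a unitary discrete Fourier transform (NumPy's `norm="ortho"`) and uses i.i.d. standard Gaussian noise as a stand-in for a Brownian increment. The helper names `unitary_dft` and `mirror_gap` are hypothetical and chosen for illustration only.

```python
import numpy as np


def unitary_dft(x):
    """Unitary DFT along the last axis (an assumed convention; it preserves the L2 norm)."""
    return np.fft.fft(x, axis=-1, norm="ortho")


def mirror_gap(x_freq):
    """Largest deviation from the mirror symmetry X[k] == conj(X[(N - k) % N])."""
    n = x_freq.shape[-1]
    mirrored = np.conj(x_freq[..., (-np.arange(n)) % n])
    return float(np.max(np.abs(x_freq - mirrored)))


rng = np.random.default_rng(0)
n_steps = 8

# Real-valued Gaussian noise in the time domain: 10,000 samples of length 8.
noise_time = rng.standard_normal((10_000, n_steps))
noise_freq = unitary_dft(noise_time)

# The frequency components of real noise are tied together in mirrored pairs.
gap = mirror_gap(noise_freq)

# Empirical variance of the real and imaginary parts at each frequency.
real_var = noise_freq.real.var(axis=0)
imag_var = noise_freq.imag.var(axis=0)

print("mirror gap:", gap)
print("variance of real parts:", np.round(real_var, 2))
print("variance of imaginary parts:", np.round(imag_var, 2))
```

Under these assumptions, the mirror gap is zero up to floating-point error, so the real parts are symmetric and the imaginary parts antisymmetric about the zero frequency. The variance is split roughly evenly between real and imaginary parts at every frequency except the zero and Nyquist frequencies, whose components are purely real. A noise process in the frequency domain therefore cannot have fully independent components, which is the kind of constraint that the mirrored Brownian motion captures.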