Scale-Equivariant Generative Forecasting: Weight-Tied Dilated Convolutions, Wavelet Scattering Inputs, and Spectral-Consistency Training for Self-Similar Time Series

Andrea Morandi

Many natural and engineered time series (equity returns, climate anomalies, turbulent velocities, neural recordings, packet-level network traffic) are approximately self-similar: their horizon-$T$ distribution matches the horizon-$1$ distribution after rescaling by $T^H$, for a single scaling exponent $H$. Standard deep generative sequence models (transformers, dilated TCNs, the WaveNet family) do not exploit this structure. Their receptive fields are wide, but kernel parameters are learned independently at every dilation level, which yields a multi-scale architecture, not a scale-equivariant one. We make three contributions. First, we give a precise definition of discrete scale equivariance for 1D causal networks and prove that dyadic dilation commutes (up to boundary effects) with any dilated-convolution stack whose kernel weights are shared across levels. Tying the kernel shrinks the convolutional parameter budget by a factor of $L$ (where $L$ is the depth) and hard-wires self-similarity as an inductive bias. Second, we wrap this Scale-Equivariant WaveNet (SE-WaveNet) backbone in three components that carry the same prior: a one-level Daubechies-4 wavelet input, a Hurst-FiLM block exposing the local scaling exponent, and a spectral-consistency training term targeting the $|f|^{-(2H+1)}$ power-law spectrum. The head is a conditional normalising flow, chosen to preserve equivariance. Third, on 30 years of S&P 500 daily log-returns, SE-WaveNet samples reproduce the empirical scaling-collapse diagnostic on the Allan-Variance top-25 universe (median $\mathcal{C}^\star = 0.020$), while a vanilla WaveNet at matched capacity does not ($\mathcal{C}^\star \geq 0.06$). NLL, KS-calibration, and tail energy distance match or improve on the baseline, with $L\times$ fewer convolutional parameters.
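The commutation claim in the first contribution can be checked numerically on a toy stack. The following is a minimal NumPy sketch and not the SE-WaveNet implementation: it assumes that dyadic dilation of a signal is zero-insertion upsampling, that the history before the first sample is zero, and that each level is a bias-free causal dilated convolution followed by tanh (no gating, residual, or skip paths). The helper names (`causal_dilated_conv`, `stack`, `upsample2`, `equivariance_gap`) are hypothetical. Under these assumptions, applying levels $1..L$ to the dilated input should equal the dilated output of levels $0..L-1$ when the kernel is tied, and generally not when each level has its own kernel.

```python
import numpy as np

def causal_dilated_conv(x, w, d):
    """y[n] = sum_k w[k] * x[n - d*k], treating samples before n = 0 as zero."""
    y = np.zeros(len(x))
    for k, wk in enumerate(w):
        shift = d * k
        if shift < len(x):
            y[shift:] += wk * x[:len(x) - shift]
    return y

def stack(x, kernels, dilations):
    """Bias-free causal stack: one dilated convolution plus tanh per level."""
    h = np.asarray(x, dtype=float)
    for w, d in zip(kernels, dilations):
        h = np.tanh(causal_dilated_conv(h, w, d))
    return h

def upsample2(x):
    """Dyadic dilation of a signal (assumed here): insert a zero after every sample."""
    u = np.zeros(2 * len(x))
    u[::2] = x
    return u

def equivariance_gap(x, kernels):
    """Max abs difference between levels 1..L applied to upsample2(x)
    and upsample2 of levels 0..L-1 applied to x."""
    L = len(kernels) - 1
    coarse = stack(upsample2(x), kernels[1:], [2 ** l for l in range(1, L + 1)])
    fine = upsample2(stack(x, kernels[:-1], [2 ** l for l in range(L)]))
    return float(np.max(np.abs(coarse - fine)))

rng = np.random.default_rng(0)
x = rng.standard_normal(64)
L = 4

shared = rng.standard_normal(2)                          # one kernel reused at every level
tied = [shared] * (L + 1)
untied = [rng.standard_normal(2) for _ in range(L + 1)]  # independent kernel per level

tied_gap = equivariance_gap(x, tied)
untied_gap = equivariance_gap(x, untied)
print(f"tied gap:   {tied_gap:.2e}")
print(f"untied gap: {untied_gap:.2e}")
```

In this toy setting the tied gap is expected to sit at floating-point level because the zero-history convention removes boundary effects; with other padding or windowing conventions the match would hold only away from the boundary, consistent with the "up to boundary effects" qualifier. The tied stack also stores a single kernel instead of one per level, which is the source of the $L$-fold parameter reduction.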
