Temporal distribution shifts are ubiquitous in time series data. One of the most popular approaches assumes that distribution shifts occur uniformly over time in order to disentangle stationary and nonstationary dependencies. This assumption is hard to satisfy in practice, because we do not know when the shifts occur. To address this problem, we propose learning IDentifiable latEnt stAtes (IDEA) to detect when distribution shifts occur. Beyond that, we disentangle the stationary and nonstationary latent states via a sufficient observation assumption to learn how the latent states change. Specifically, we formalize the causal process with environment-irrelevant stationary variables and environment-related nonstationary variables. Under mild conditions, we show that the latent environments and the stationary/nonstationary variables are identifiable. Building on these theoretical results, we devise the IDEA model, which incorporates an autoregressive hidden Markov model to estimate latent environments and modular prior networks to identify latent states. The IDEA model outperforms several recent nonstationary forecasting methods on various benchmark datasets, highlighting its advantages in real-world scenarios.
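To make the idea of estimating latent environments with an autoregressive hidden Markov model more concrete, the following is a minimal sketch and not the IDEA implementation. It assumes a scalar series, Gaussian AR(1) dynamics with one known coefficient per environment, and a fixed transition matrix, and it uses Viterbi decoding to recover the most likely environment sequence. The function name `arhmm_viterbi` and all parameter names are hypothetical; in the paper the corresponding quantities are learned jointly with the latent states rather than fixed by hand.

```python
import numpy as np

def arhmm_viterbi(x, coefs, sigma, trans, init):
    """Most likely environment path under a toy AR(1) hidden Markov model.

    Assumed (illustrative) model: x[t] = coefs[e_t] * x[t-1] + N(0, sigma^2),
    where e_t is a discrete environment following a Markov chain.
    Returns an array of length len(x) - 1; entry i is the environment
    that generated x[i + 1] from x[i].
    """
    x = np.asarray(x, dtype=float)
    coefs = np.asarray(coefs, dtype=float)
    n_steps, n_env = len(x) - 1, len(coefs)

    # Gaussian log-likelihood of each transition under each environment.
    resid = x[1:, None] - coefs[None, :] * x[:-1, None]
    log_emit = -0.5 * (resid / sigma) ** 2 - np.log(sigma * np.sqrt(2 * np.pi))
    log_trans = np.log(trans)

    # Forward pass: best log-score of any path ending in each environment.
    delta = np.log(init) + log_emit[0]
    back = np.zeros((n_steps, n_env), dtype=int)
    for t in range(1, n_steps):
        scores = delta[:, None] + log_trans  # scores[i, j]: move from i to j
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + log_emit[t]

    # Backward pass: follow the stored argmax pointers.
    path = np.empty(n_steps, dtype=int)
    path[-1] = delta.argmax()
    for t in range(n_steps - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path

# Noise-free toy series: coefficient 0.9 for two steps, then -0.9 for two steps.
x = [1.0, 0.9, 0.81, -0.729, 0.6561]
path = arhmm_viterbi(
    x,
    coefs=[0.9, -0.9],
    sigma=0.1,
    trans=[[0.9, 0.1], [0.1, 0.9]],
    init=[0.5, 0.5],
)
# Index of the first observation generated under a new environment.
shifts = np.flatnonzero(np.diff(path)) + 2
print(path)    # [0 0 1 1]
print(shifts)  # [3]
```

The decoded path switches environments exactly where the sign of the autoregressive coefficient flips, which is the sense in which such a model can indicate when a distribution shift occurs instead of assuming that shifts happen at uniform intervals.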