Modern time series forecasting methods, such as the Transformer and its variants, have shown strong ability in sequential data modeling. To achieve high performance, they usually rely on redundant or unexplainable structures to model complex relations between variables, and they tune their parameters on large-scale data. Many real-world data mining tasks, however, lack sufficient variables for relation reasoning, so these methods may not handle such forecasting problems properly. With insufficient data, a time series appears to be affected by many exogenous variables, and the modeling becomes unstable and unpredictable. To tackle this issue, we develop a novel algorithmic framework for inferring the intrinsic latent factors implied by the observable time series. The inferred factors are used to form multiple independent and predictable signal components, which enable not only sparse relation reasoning for long-term efficiency but also reconstruction of future temporal data for accurate prediction. To achieve this, we introduce three characteristics, namely predictability, sufficiency, and identifiability, and model them with deep latent dynamics models to infer the predictable signal components. Empirical results on multiple real datasets show the efficiency of our method across different kinds of time series forecasting tasks. Statistical analysis validates the predictability of the learned latent factors.