Transformer-based models have become a leading paradigm in time-series forecasting in recent years, using self-attention to capture long-range dependencies. Despite their success, these single-stage forecasting architectures exhibit persistent, systematic residual biases that arise from structural discrepancies, unmodeled stochastic components, or inadequate multi-scale temporal representations. The limitation persists as long as residuals are treated as irreducible noise, which precludes adaptive correction of structured error patterns. To address it, we introduce a two-stage, model-agnostic framework that explicitly decouples forecasting and residual learning into distinct stages of representation learning. A base transformer first generates initial predictions. A dedicated meta-corrector then dynamically models structured error patterns across multivariate channels, preserves cross-variable dependencies, and iteratively refines the residual bias of the base transformer. By formalizing this pipeline as a hypothesis space expansion, our framework addresses approximation limitations inherent in single-stage architectures, removes reliance on restrictive assumptions, and enables end-to-end learning of complex error dynamics. Evaluated on eight widely used benchmark datasets under established protocols, our approach achieves state-of-the-art performance, with significant improvements in the standard metrics MSE and MAE. The results demonstrate the framework's ability to mitigate systematic biases and improve robustness to complex temporal dynamics, advancing the practical applicability of transformer-based forecasting models.
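The two-stage idea described above (a base forecaster followed by a corrector trained on its residuals) can be illustrated with a minimal sketch. This is not the authors' implementation: both stages are replaced here by closed-form ridge regressions instead of a transformer and a meta-corrector, the data are synthetic, and every name (`make_windows`, `fit_ridge`, `predict`, `two_stage_demo`) as well as the choice to feed the corrector the input window plus the base forecast are assumptions made for illustration only.

```python
import numpy as np


def make_windows(series, lookback, horizon):
    """Slice a (T, C) series into flattened inputs (N, lookback*C) and targets (N, horizon*C)."""
    T, _ = series.shape
    X, Y = [], []
    for t in range(T - lookback - horizon + 1):
        X.append(series[t:t + lookback].ravel())
        Y.append(series[t + lookback:t + lookback + horizon].ravel())
    return np.asarray(X), np.asarray(Y)


def fit_ridge(X, Y, lam=1e-3):
    """Closed-form ridge regression with an appended bias column; returns the weight matrix."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    A = Xb.T @ Xb + lam * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ Y)


def predict(W, X):
    """Apply a weight matrix produced by fit_ridge."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ W


def two_stage_demo(lookback=24, horizon=4, seed=0):
    """Return (base MSE, corrected MSE) on the training windows of a synthetic series."""
    rng = np.random.default_rng(seed)
    t = np.arange(400)
    # Hypothetical two-channel series with two periodicities plus noise.
    series = np.stack(
        [np.sin(2 * np.pi * t / 24),
         np.cos(2 * np.pi * t / 24) + 0.5 * np.sin(2 * np.pi * t / 7)],
        axis=1,
    )
    series = series + 0.05 * rng.standard_normal(series.shape)
    n_channels = series.shape[1]

    X, Y = make_windows(series, lookback, horizon)

    # Stage 1: a deliberately weak base forecaster (stand-in for the base
    # transformer) that only sees the most recent time step of each window.
    X_last = X[:, -n_channels:]
    W_base = fit_ridge(X_last, Y)
    base_pred = predict(W_base, X_last)

    # Stage 2: a corrector (stand-in for the meta-corrector) that sees the full
    # window and the base forecast, and is trained on the base residuals.
    residual = Y - base_pred
    Z = np.hstack([X, base_pred])
    W_corr = fit_ridge(Z, residual)
    final_pred = base_pred + predict(W_corr, Z)

    mse_base = float(np.mean((Y - base_pred) ** 2))
    mse_final = float(np.mean((Y - final_pred) ** 2))
    return mse_base, mse_final


if __name__ == "__main__":
    mse_base, mse_final = two_stage_demo()
    print(f"base MSE: {mse_base:.4f}, corrected MSE: {mse_final:.4f}")
```

In this sketch, the corrected forecast's fit on the training windows cannot be worse than the base forecast's. The ridge objective of the corrector is no larger at its optimum than at zero weights, and zero weights reproduce the base forecast exactly. That is a toy analogue of the hypothesis space expansion argument, and it says nothing about held-out performance, which would need a separate evaluation split.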