We study adaptive pooling under predictive heterogeneity in high-dimensional multivariate time series forecasting. In this setting, global models improve statistical efficiency but may fail to capture heterogeneous predictive structure, while naive specialization can induce negative transfer. We formulate adaptive pooling as a statistical decision problem and propose a validation-driven framework that determines when and how specialization should be applied. Rather than grouping series by representation similarity, we define partitions through out-of-sample predictive performance, which aligns data organization with predictive risk. Here predictive risk is the expected out-of-sample loss, approximated by validation error. Cluster assignments are iteratively updated using validation losses for both point (Huber) and probabilistic (pinball) forecasting, improving robustness to heavy-tailed errors and local anomalies. To ensure reliability, we introduce a leakage-free fallback mechanism that reverts to a global model whenever specialization fails to improve validation performance, safeguarding against performance degradation under a strict training-validation-test protocol. Experiments on large-scale traffic datasets demonstrate consistent improvements over strong baselines while avoiding degradation when heterogeneity is weak. Overall, the proposed framework provides a principled and practically reliable approach to adaptive pooling in high-dimensional forecasting problems.
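As a rough illustration of the validation-driven assignment and fallback described in the abstract, the following minimal sketch shows one way the pieces could fit together. It is not the authors' implementation. The function names (`huber_loss`, `pinball_loss`, `reassign`, `use_specialized`), the threshold `delta`, and the choice to apply the fallback per series are all assumptions made for illustration. The sketch also assumes the loss values are computed on the validation split only, so the test split plays no role in any decision.

```python
import numpy as np

def huber_loss(y, yhat, delta=1.0):
    """Mean Huber loss along the last axis (point forecasts)."""
    r = np.abs(np.asarray(y, dtype=float) - np.asarray(yhat, dtype=float))
    quadratic = 0.5 * r ** 2                 # used where |residual| <= delta
    linear = delta * (r - 0.5 * delta)       # used where |residual| > delta
    return np.where(r <= delta, quadratic, linear).mean(axis=-1)

def pinball_loss(y, yhat_q, q):
    """Mean pinball (quantile) loss along the last axis for quantile level q."""
    d = np.asarray(y, dtype=float) - np.asarray(yhat_q, dtype=float)
    return np.maximum(q * d, (q - 1.0) * d).mean(axis=-1)

def reassign(val_loss):
    """Assign each series to the cluster model with the lowest validation loss.

    val_loss has shape (n_series, n_clusters); entry [i, k] is the validation
    loss of cluster model k on series i (hypothetical layout).
    """
    return np.asarray(val_loss).argmin(axis=1)

def use_specialized(val_loss_global, val_loss_specialized):
    """Fallback rule (assumed per series): keep the specialized model only
    where it strictly improves validation loss; otherwise use the global one."""
    return np.asarray(val_loss_specialized) < np.asarray(val_loss_global)

# Toy usage with made-up validation losses for two series and two clusters.
val_loss = np.array([[0.2, 0.5],
                     [0.9, 0.1]])
assignments = reassign(val_loss)
specialized_loss = val_loss[np.arange(len(assignments)), assignments]
mask = use_specialized(np.array([0.15, 0.40]), specialized_loss)
```

In an iterative scheme, one would presumably alternate between refitting each cluster model on its assigned series and calling `reassign` on freshly computed validation losses until the assignments stop changing. That loop is omitted here because the abstract does not specify its details.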