Forecasting ultra-long time series plays a vital role in activities such as investment decisions, industrial production arrangements, and farm management. This paper develops a novel distributed forecasting framework, built on the industry-standard MapReduce framework, to tackle the challenges of forecasting ultra-long time series. The proposed model combination approach facilitates distributed time series forecasting by combining the local estimators of time series models delivered from worker nodes and minimizing a global loss function. In this way, instead of unrealistically assuming that the data generating process (DGP) of an ultra-long time series stays invariant, we make assumptions only on the DGP of subseries spanning shorter time periods. We investigate the performance of the proposed approach with AutoRegressive Integrated Moving Average (ARIMA) models using a real data application as well as numerical simulations. Compared with fitting ARIMA models directly to the whole series, our approach improves both forecasting accuracy and computational efficiency, for point forecasts as well as prediction intervals, especially at longer forecast horizons. Moreover, we explore some potential factors that may affect the forecasting performance of our approach.
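The split, fit, and combine idea described above can be illustrated with a minimal sketch. It is not the paper's implementation, and it rests on several assumptions: a zero-mean AR(1) model stands in for ARIMA, the global loss is taken to be an inverse-variance weighted squared distance between the global and local coefficients, and a plain list comprehension stands in for the MapReduce worker nodes. All function names (`simulate_ar1`, `split`, `fit_local_ar1`, `combine`, `forecast`) are hypothetical.

```python
import random


def simulate_ar1(n, phi, sigma=1.0, seed=0):
    """Generate n points of a zero-mean AR(1) series (toy data, assumption)."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, sigma))
    return x


def split(series, k):
    """Map-side partition: cut the series into k contiguous subseries."""
    size = len(series) // k
    parts = [series[i * size:(i + 1) * size] for i in range(k - 1)]
    parts.append(series[(k - 1) * size:])  # last part takes the remainder
    return parts


def fit_local_ar1(sub):
    """Worker step: least-squares AR(1) fit on one subseries.

    Returns the local coefficient and an estimate of its variance.
    """
    sxx = sum(a * a for a in sub[:-1])
    sxy = sum(a * b for a, b in zip(sub[:-1], sub[1:]))
    phi = sxy / sxx
    resid = [b - phi * a for a, b in zip(sub[:-1], sub[1:])]
    sigma2 = sum(r * r for r in resid) / (len(resid) - 1)
    return phi, sigma2 / sxx


def combine(local):
    """Reduce step: minimize sum_k (phi - phi_k)^2 / var_k over phi.

    For a scalar coefficient the minimizer is the inverse-variance
    weighted mean of the local estimates (assumed loss, see lead-in).
    """
    weights = [1.0 / var for _, var in local]
    return sum(w * p for w, (p, _) in zip(weights, local)) / sum(weights)


def forecast(last_value, phi, horizon):
    """Iterate the AR(1) recursion to get point forecasts."""
    out = []
    for _ in range(horizon):
        last_value = phi * last_value
        out.append(last_value)
    return out


if __name__ == "__main__":
    series = simulate_ar1(20000, phi=0.6, seed=42)
    local = [fit_local_ar1(s) for s in split(series, 8)]  # stand-in for workers
    phi_global = combine(local)
    print("combined coefficient:", phi_global)
    print("3-step forecast:", forecast(series[-1], phi_global, 3))
```

Because each worker returns only a coefficient and its variance, the reduce step needs a few numbers per subseries rather than the raw data, which is what makes this kind of combination cheap to distribute.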