Existing hierarchical forecasting techniques scale poorly as the number of time series increases. We propose to learn a coherent forecast for millions of time series with a single bottom-level forecast model by using a sparse loss function that directly optimizes the hierarchical product and/or temporal structure. Our sparse hierarchical loss function gives practitioners a way to produce bottom-level forecasts that are coherent with any chosen cross-sectional or temporal hierarchy. In addition, it removes the post-processing step that traditional hierarchical forecasting techniques require, which reduces the computational cost of the prediction phase in the forecasting pipeline. On the public M5 dataset, our sparse hierarchical loss function performs up to 10% better (in RMSE) than the baseline loss function. We implement our sparse hierarchical loss function within an existing forecasting model at bol, a large European e-commerce platform, which improves forecasting performance by 2% at the product level. Finally, we find an improvement in forecasting performance of about 5-10% when evaluating across the cross-sectional hierarchies that we defined. These results demonstrate the usefulness of our sparse hierarchical loss in a production forecasting system at a major e-commerce platform.
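To make the idea of a sparse hierarchical loss concrete, the following is a minimal sketch, not the loss used in this work. It assumes a two-level cross-sectional hierarchy (total, category, bottom-level series) and an unweighted mean of squared errors over all aggregation nodes; the exact weighting and scaling of the actual loss are not specified here. The function names `build_sparse_summing_rows` and `sparse_hierarchical_loss`, as well as the category labels, are hypothetical. Each row of the summing matrix is stored only as the list of bottom-level indices it sums, which is what keeps the representation sparse.

```python
def build_sparse_summing_rows(groups):
    """Return the rows of a summing matrix in sparse form.

    groups[i] is the (hypothetical) category label of bottom-level series i.
    Each row is a list of the bottom-level indices that the node aggregates.
    Row order: total, one row per category (sorted), one row per series.
    """
    n = len(groups)
    rows = [list(range(n))]  # total level: sums every bottom-level series
    for label in sorted(set(groups)):
        rows.append([i for i, g in enumerate(groups) if g == label])
    rows.extend([i] for i in range(n))  # bottom level: identity rows
    return rows


def sparse_hierarchical_loss(y_true, y_pred, rows):
    """Mean squared error over all aggregation nodes (assumed unweighted).

    Only bottom-level values are passed in; every aggregate is computed by
    summing them, so the aggregated forecasts are coherent by construction.
    """
    total = 0.0
    for idx in rows:
        agg_true = sum(y_true[i] for i in idx)
        agg_pred = sum(y_pred[i] for i in idx)
        total += (agg_true - agg_pred) ** 2
    return total / len(rows)


# Three bottom-level series: two in category "a", one in category "b".
rows = build_sparse_summing_rows(["a", "a", "b"])
y_true = [1.0, 2.0, 3.0]

# Errors that cancel within category "a" are penalized only at the bottom level.
loss_cancelling = sparse_hierarchical_loss(y_true, [1.5, 1.5, 3.0], rows)

# A systematic over-forecast is penalized at every level of the hierarchy.
loss_biased = sparse_hierarchical_loss(y_true, [2.0, 3.0, 4.0], rows)
```

In this sketch, a forecast whose bottom-level errors cancel within a category incurs a much smaller loss than a forecast that is biased in the same direction for every series, because the bias accumulates at the category and total levels. In a real model the same aggregation would typically be expressed as a sparse matrix product so that gradients flow back to the single bottom-level model.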