Existing hierarchical forecasting techniques scale poorly as the number of time series grows. We propose learning coherent forecasts for millions of time series with a single bottom-level forecast model, using a sparse loss function that directly optimizes the hierarchical product and/or temporal structure. Our sparse hierarchical loss function gives practitioners a way to produce bottom-level forecasts that are coherent with any chosen cross-sectional or temporal hierarchy. It also removes the post-processing step required by traditional hierarchical forecasting techniques, which reduces the computational cost of the prediction phase in the forecasting pipeline. On the public M5 dataset, our sparse hierarchical loss function improves RMSE by up to 10% over the baseline loss function. We implement the loss within an existing forecasting model at bol, a large European e-commerce platform, where it improves forecasting performance by 2% at the product level. Finally, we find an increase in forecasting performance of about 5-10% when evaluating across the cross-sectional hierarchies that we defined. These results demonstrate the usefulness of our sparse hierarchical loss in a production forecasting system at a major e-commerce platform.
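To make the idea of a loss that scores bottom-level forecasts against a hierarchy concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy cross-sectional hierarchy, equal weighting of every bottom-level series and aggregate, and a plain squared error; the names `Hierarchy`, `aggregate`, and `hierarchical_squared_error` are hypothetical. The hierarchy is stored sparsely, as the list of bottom-level members of each aggregate, rather than as a dense summing matrix.

```python
from typing import Dict, List, Sequence

# Hypothetical sparse hierarchy: each aggregate lists only the bottom-level
# series it sums. A dense summing matrix would store mostly zeros instead.
Hierarchy = Dict[str, List[int]]


def aggregate(values: Sequence[float], hierarchy: Hierarchy) -> Dict[str, float]:
    """Sum bottom-level values into every aggregate of the hierarchy."""
    return {
        name: sum(values[i] for i in members)
        for name, members in hierarchy.items()
    }


def hierarchical_squared_error(
    forecast: Sequence[float],
    actual: Sequence[float],
    hierarchy: Hierarchy,
) -> float:
    """Mean squared error over bottom-level series and all aggregates.

    Assumption for illustration: every series and aggregate gets equal weight.
    """
    if len(forecast) != len(actual):
        raise ValueError("forecast and actual must have the same length")

    # Errors at the bottom level.
    errors = [(f - a) ** 2 for f, a in zip(forecast, actual)]

    # Errors at the aggregated levels, computed from the same bottom-level
    # forecasts, so the forecasts are coherent by construction.
    agg_forecast = aggregate(forecast, hierarchy)
    agg_actual = aggregate(actual, hierarchy)
    errors += [(agg_forecast[k] - agg_actual[k]) ** 2 for k in hierarchy]

    return sum(errors) / len(errors)


# Toy data: four products in two categories plus a grand total.
hierarchy: Hierarchy = {
    "category_A": [0, 1],
    "category_B": [2, 3],
    "total": [0, 1, 2, 3],
}
actual = [2, 4, 1, 3]

# Same bottom-level squared error in both forecasts, but only the second
# one is also wrong at the category and total levels.
loss_cancelling = hierarchical_squared_error([3, 3, 1, 3], actual, hierarchy)
loss_biased = hierarchical_squared_error([3, 5, 1, 3], actual, hierarchy)
```

In this toy case both forecasts miss two products by one unit each, yet the second forecast receives the larger loss because its errors do not cancel when summed to the category and total levels. Because only bottom-level forecasts are produced and the aggregates are derived from them, no reconciliation step is needed after prediction.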