Pre-trained models have become pivotal in improving the efficiency and accuracy of time series forecasting on target datasets through transfer learning. While benchmarks validate how well models generalize to various target datasets, no structured research provides similarity and diversity measures that explain which characteristics of source and target data lead to transfer learning success. Our study pioneers the systematic evaluation of how source-target similarity and source diversity affect zero-shot and fine-tuned forecasting outcomes in terms of accuracy, bias, and uncertainty estimation. We investigate these dynamics using neural networks pre-trained on five public source datasets and applied to forecasting five target datasets, including real-world wholesale data. We identify two feature-based similarity and diversity measures and find that source-target similarity reduces forecasting bias, while source diversity improves forecasting accuracy and uncertainty estimation but increases bias.