Adapting Time Series Foundation Models through Data Mixtures

Time series foundation models (TSFMs) have become increasingly popular for zero-shot forecasting. However, performance can suffer on a new time series domain that is not fully covered by the pretraining set. Therefore, when a practitioner cares about a new domain and has access to a set of related datasets, the question arises: how best to fine-tune a TSFM to improve zero-shot forecasting? A typical approach to this type of problem is to fine-tune a LoRA module either on all datasets jointly or separately on each dataset. Tuning a separate module on each dataset allows the TSFM to specialise to different types of data distribution, by selecting different combinations of per-dataset modules for different time series contexts. However, we find that using per-dataset modules might not be optimal, since a single time series dataset can contain data from several types of distribution, i.e. sub-domains. This can arise because the distribution shifts over time or because different dimensions of the time series follow different distributions. Hence, we propose MixFT, which uses Bayesian mixtures to re-divide the data into sets that best represent the sub-domains present in the data, and fine-tunes a separate module on each of these sets. This re-division makes each set more homogeneous, leading to fine-tuned modules focused on specific sub-domains. Our experiments show that MixFT outperforms both per-dataset methods and fine-tuning a single module on all the data. This suggests that by re-partitioning the data to represent sub-domains we can better specialise TSFMs to improve zero-shot forecasting.
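The re-partitioning idea can be illustrated with a small sketch. This is not the MixFT implementation: the abstract does not specify the mixture model or the features used, so everything below is an assumption made for illustration. Windows are summarised by their mean and standard deviation, and a diagonal Gaussian mixture fitted by MAP-EM with a symmetric Dirichlet prior on the weights stands in for the Bayesian mixture. The names `window_features`, `fit_mixture`, `make_window`, and the toy datasets `d1` and `d2` are hypothetical.

```python
import math
from collections import defaultdict


def window_features(window):
    """Summarise a window by mean and standard deviation (illustrative choice)."""
    mean = sum(window) / len(window)
    var = sum((x - mean) ** 2 for x in window) / len(window)
    return [mean, math.sqrt(var)]


def fit_mixture(points, k, alpha=2.0, iters=50, var_floor=1e-6):
    """MAP-EM for a diagonal Gaussian mixture; returns a hard label per point.

    alpha >= 1 is the parameter of a symmetric Dirichlet prior on the weights.
    This is a simple stand-in, not the mixture model used by MixFT.
    """
    n, d = len(points), len(points[0])
    # Deterministic start: means spread over the points sorted by feature 0.
    order = sorted(range(n), key=lambda i: points[i][0])
    means = [list(points[order[(j * (n - 1)) // max(k - 1, 1)]]) for j in range(k)]
    centre = [sum(p[f] for p in points) / n for f in range(d)]
    spread = [max(sum((p[f] - centre[f]) ** 2 for p in points) / n, var_floor)
              for f in range(d)]
    variances = [list(spread) for _ in range(k)]
    weights = [1.0 / k] * k
    resp = []
    for _ in range(iters):
        # E-step: responsibilities, computed with a stable log-sum-exp.
        resp = []
        for p in points:
            logs = []
            for j in range(k):
                ll = math.log(weights[j])
                for f in range(d):
                    v = variances[j][f]
                    ll -= 0.5 * (math.log(2 * math.pi * v)
                                 + (p[f] - means[j][f]) ** 2 / v)
                logs.append(ll)
            top = max(logs)
            exps = [math.exp(l - top) for l in logs]
            total = sum(exps)
            resp.append([e / total for e in exps])
        # M-step: MAP weights under the Dirichlet prior, weighted means/variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = (nj + alpha - 1.0) / (n + k * (alpha - 1.0))
            if nj < 1e-12:
                continue  # empty component: keep its previous parameters
            for f in range(d):
                means[j][f] = sum(r[j] * p[f] for r, p in zip(resp, points)) / nj
                variances[j][f] = max(
                    sum(r[j] * (p[f] - means[j][f]) ** 2
                        for r, p in zip(resp, points)) / nj,
                    var_floor)
    return [max(range(k), key=lambda j: r[j]) for r in resp]


def make_window(level, amplitude, phase, length=32):
    """Synthetic window: a sine wave around a given level (toy data only)."""
    return [level + amplitude * math.sin(0.5 * i + phase) for i in range(length)]


# Two toy datasets, each containing windows from two sub-domains.
datasets = {
    "d1": [make_window(0.0, 0.1, 0.3 * t) for t in range(10)]
          + [make_window(10.0, 2.0, 0.3 * t) for t in range(10)],
    "d2": [make_window(10.0, 2.0, 0.7 * t) for t in range(10)]
          + [make_window(0.0, 0.1, 0.7 * t) for t in range(10)],
}

# Pool all windows, ignoring dataset boundaries, then re-partition by component.
pooled = [(name, idx, w) for name, ws in datasets.items() for idx, w in enumerate(ws)]
labels = fit_mixture([window_features(w) for _, _, w in pooled], k=2)
subsets = defaultdict(list)
for (name, idx, _), label in zip(pooled, labels):
    subsets[label].append((name, idx))
```

In this toy setting each entry of `subsets` mixes windows from both `d1` and `d2`, grouped by sub-domain rather than by source dataset. In the setting the abstract describes, each such subset would then be used to fine-tune its own module; that step is omitted here because it depends on the TSFM and fine-tuning library in use.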
