Foundation Models (FMs) have improved time series forecasting in various sectors, such as finance, but their vulnerability to input disturbances can hinder their adoption by stakeholders, such as investors and analysts. To address this, we propose a causally grounded rating framework to study the robustness of Foundation Models for Time Series (FMTS) with respect to input perturbations. We evaluate our approach on the stock price prediction problem, a well-studied problem with easily accessible public data, testing six state-of-the-art FMTS (some of them multi-modal) on six prominent stocks spanning three industries. The ratings produced by our framework effectively assess the robustness of FMTS and also offer actionable insights for model selection and deployment. Within the scope of our study, we find that (1) multi-modal FMTS exhibit better robustness and accuracy than their uni-modal versions, and (2) FMTS pre-trained on the time series forecasting task exhibit better robustness and forecasting accuracy than general-purpose FMTS pre-trained across diverse settings. Further, to validate our framework's usability, we conduct a user study that presents FMTS prediction errors along with our computed ratings. The study confirmed that our ratings made it easier for users to compare the robustness of different systems.