Ongoing digitization has led to a proliferation of time-series data streams that monitor a variety of processes and from which valuable insights can be obtained. Further, the emergence of successful foundational language models raises the question of whether time-series models can offer the same foundational ability to handle multiple tasks while remaining sufficiently lightweight for real-time data stream processing. Existing foundational time-series models are often large and are effective only in offline settings that lack stringent time and computational constraints and do not require repeated model calibration. When applied to data streams, these models are ineffective: their size and lack of support for continual calibration compromise their ability to deliver accurate real-time responses, their durability, and their deployability in hardware-limited settings. We propose TimeBlocks, which enables versatile time-series processing by facilitating the efficient construction of lightweight models suitable for multiple tasks under variable conditions. In particular, the method maintains a pool of interchangeable, modular model blocks that can be used to construct new time-series models. When presented with specific time-series data, a routing strategy iteratively selects the most suitable blocks to construct a lightweight and accurate model for that data. We equip TimeBlocks with a method called StreamCore, which builds a small representative subset of the data stream that preserves a guaranteed approximation of the stream over time, enabling continual model calibration. An experimental study on multiple data sets and covering multiple tasks shows that TimeBlocks enables building models capable of outperforming existing baselines.
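
To make the two ideas above concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a toy pool of three hypothetical forecasting blocks (`last`, `mean`, `trend`), uses greedy forward selection as a stand-in for the routing strategy, and uses classic reservoir sampling as a stand-in for StreamCore. Reservoir sampling yields only a uniform sample of the stream and does not provide the approximation guarantee that the abstract attributes to StreamCore. All function and variable names are assumptions made for illustration.

```python
import random

# Hypothetical pool of tiny forecasting "blocks": each maps a window to a next-value guess.
POOL = {
    "last": lambda w: w[-1],                       # repeat the last observation
    "mean": lambda w: sum(w) / len(w),             # average of the window
    "trend": lambda w: w[-1] + (w[-1] - w[-2]),    # extrapolate the last step
}


def reservoir_sample(stream, k, seed=0):
    """Keep a uniform random sample of k items from a stream of unknown length.

    Stand-in for StreamCore (assumption): uniform sampling only, no guarantee.
    """
    rng = random.Random(seed)
    sample = []
    for i, item in enumerate(stream):
        if i < k:
            sample.append(item)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = item
    return sample


def mse(names, data):
    """Mean squared error of the averaged prediction of the named blocks."""
    total = 0.0
    for window, target in data:
        pred = sum(POOL[n](window) for n in names) / len(names)
        total += (pred - target) ** 2
    return total / len(data)


def route(data, max_blocks=3):
    """Greedy stand-in for routing: add the block that lowers the error most."""
    chosen, best = [], float("inf")
    while len(chosen) < max_blocks:
        candidates = [(mse(chosen + [n], data), n) for n in POOL if n not in chosen]
        if not candidates:
            break
        err, name = min(candidates)
        if err >= best:  # stop when no remaining block improves the model
            break
        chosen.append(name)
        best = err
    return chosen, best


# Toy stream: a linear ramp, cut into (window, next value) pairs.
series = [float(t) for t in range(200)]
pairs = [(series[i:i + 4], series[i + 4]) for i in range(len(series) - 4)]

subset = reservoir_sample(pairs, k=32)   # small subset kept for calibration
chosen, err = route(subset)              # model assembled from the pool
print(chosen, err)
```

In this sketch, recalibration amounts to re-running `route` on the current `subset` as new stream items arrive, so the cost of each recalibration depends on the subset size rather than on the length of the stream.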