Irregular multivariate time series are characterized by varying time intervals between consecutive observations of the measured variables/signals (i.e., features) and by sampling rates (i.e., recording frequencies) that differ across these features. Modeling such series while accounting for these irregularities remains a challenging task for machine learning methods. Here, we introduce TADA, a Two-stage Aggregation process with Dynamic local Attention that harmonizes time-wise and feature-wise irregularities in multivariate time series. In the first stage, the irregular time series undergoes temporal embedding (TE) using all features available at each time step. This step preserves the contribution of each available feature and produces a fixed-dimensional representation per time step. The second stage introduces a dynamic local attention (DLA) mechanism with adaptive window sizes: DLA aggregates time recordings within feature-specific windows, harmonizing irregular time intervals while capturing feature-specific sampling rates. Hierarchical MLP-mixer layers then process the output of DLA through multiscale patching, leveraging information at multiple scales for downstream tasks. TADA outperforms state-of-the-art methods on three real-world datasets, including the latest MIMIC IV dataset, demonstrating its effectiveness in handling irregular multivariate time series and its potential for a range of real-world applications.
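The core idea of feature-specific windowed aggregation can be illustrated with a toy sketch. This is not the paper's implementation: the window sizes here are fixed inputs rather than learned, the attention weights are a simple temporal-proximity softmax, and all function and variable names (`dynamic_local_aggregation`, `window_sizes`, `query_times`) are illustrative assumptions. It only shows how each feature's irregular observations can be pooled over its own local window onto a shared time grid.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()

def dynamic_local_aggregation(times, values, window_sizes, query_times):
    """Toy sketch of DLA-style aggregation (illustrative, not the paper's model).

    times[f], values[f]: irregular observation times and values of feature f.
    window_sizes[f]:     feature-specific window half-width (fixed here;
                         adaptive/learned in the actual method).
    query_times:         shared time grid onto which features are aggregated.
    Returns an array of shape (len(query_times), n_features).
    """
    n_features = len(window_sizes)
    out = np.zeros((len(query_times), n_features))
    for f in range(n_features):
        t_f, v_f, w = times[f], values[f], window_sizes[f]
        for i, q in enumerate(query_times):
            mask = np.abs(t_f - q) <= w  # observations inside feature f's window
            if mask.any():
                # attention scores: temporally closer observations weigh more
                scores = -np.abs(t_f[mask] - q) / w
                out[i, f] = softmax(scores) @ v_f[mask]
    return out

# Two features sampled at very different rates, aggregated onto one grid.
times = [np.array([0.0, 1.0, 2.0]), np.array([0.0, 5.0])]
values = [np.array([10.0, 20.0, 30.0]), np.array([1.0, 2.0])]
agg = dynamic_local_aggregation(times, values,
                                window_sizes=[0.5, 1.0],
                                query_times=np.array([1.0]))
```

Because each feature uses its own window, a densely sampled signal draws only on nearby readings, while a sparsely sampled one can reach further back, which is the harmonization effect the abstract describes.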