We frequently encounter multiple temporally correlated series in our surroundings, such as EEG recordings used to examine alterations in brain activity or sensor readings used to monitor body movements. Segmentation of multivariate time series is a technique for identifying meaningful patterns or changes that can signal a shift in the system's behavior. However, most segmentation algorithms have been designed primarily for univariate time series, and their performance on multivariate data remains largely unsatisfactory, making this a challenging problem. In this work, we introduce a novel approach for multivariate time series segmentation using conditional independence (CI) graphs. CI graphs are probabilistic graphical models that represent the partial correlations between the nodes. We propose a domain-agnostic multivariate segmentation framework `$\texttt{tGLAD}$', which draws a parallel between the CI graph nodes and the variables of the time series. Applying a graph recovery model, $\texttt{uGLAD}$, to a short interval of the time series yields a CI graph that shows the partial correlations among the variables. We extend this idea to the entire time series by using a sliding window to create a batch of time intervals and then running a single $\texttt{uGLAD}$ model in multitask learning mode to recover all the CI graphs simultaneously. As a result, we obtain a corresponding temporal CI graph representation. We then design first-order and second-order trajectory tracking algorithms to study the evolution of these graphs across distinct intervals. Finally, an `Allocation' algorithm is used to determine a suitable segmentation of the temporal graph sequence. $\texttt{tGLAD}$ provides a competitive time complexity of $O(N)$ in settings where the number of variables $D \ll N$. We demonstrate successful empirical results on a Physical Activity Monitoring dataset.
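The windowing and graph-tracking idea described above can be illustrated with a minimal sketch. This is not the $\texttt{tGLAD}$ implementation: as a stand-in for $\texttt{uGLAD}$, it assumes each window's CI graph can be approximated by partial correlations read off a ridge-regularized inverse covariance matrix, and it assumes the Frobenius norm of the difference between consecutive graphs as a first-order distance. The function names, the ridge value, the window width, and the synthetic data are all hypothetical choices made for illustration, and the `Allocation' step is not reproduced.

```python
import numpy as np


def sliding_windows(series, width, stride):
    """Split a (N, D) series into a list of (width, D) windows."""
    starts = range(0, series.shape[0] - width + 1, stride)
    return [series[s:s + width] for s in starts]


def partial_correlations(window, ridge=1e-3):
    """Approximate a CI graph for one window.

    Assumption: a ridge-regularized inverse covariance stands in for
    the uGLAD graph recovery model, which is not used here.
    """
    cov = np.cov(window, rowvar=False)
    d = cov.shape[0]
    precision = np.linalg.inv(cov + ridge * np.eye(d))
    scale = np.sqrt(np.diag(precision))
    # Partial correlation: rho_ij = -P_ij / sqrt(P_ii * P_jj)
    pcorr = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcorr, 1.0)
    return pcorr


def first_order_distances(graphs):
    """Frobenius distance between each pair of consecutive graphs."""
    return np.array([
        np.linalg.norm(graphs[i + 1] - graphs[i])
        for i in range(len(graphs) - 1)
    ])


def demo():
    """Synthetic 3-variable series with one regime change at t = 400."""
    rng = np.random.default_rng(0)
    n_half = 400
    # Regime 1: variables 0 and 1 are strongly dependent.
    x0 = rng.normal(size=n_half)
    regime1 = np.column_stack([
        x0,
        x0 + 0.3 * rng.normal(size=n_half),
        rng.normal(size=n_half),
    ])
    # Regime 2: all three variables are independent.
    regime2 = rng.normal(size=(n_half, 3))
    series = np.vstack([regime1, regime2])

    windows = sliding_windows(series, width=100, stride=100)
    graphs = [partial_correlations(w) for w in windows]
    return graphs, first_order_distances(graphs)


graphs, distances = demo()
# Index of the largest jump between consecutive window graphs.
print(int(np.argmax(distances)), np.round(distances, 2))
```

On this synthetic input, the largest distance is expected between the two windows on either side of the regime change, which is the kind of signal a segmentation step would act on. Non-overlapping windows are used only to keep the example short; an overlapping stride would give a finer-grained trajectory.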