Computing the distance between two time series is time-consuming for any elastic distance function that accounts for misalignments. Among these functions, DTW is the most prominent. However, a recent extensive evaluation has shown that the Move-Split-Merge (MSM) metric is superior to DTW with respect to the accuracy of the 1-NN classifier. Unfortunately, the running time of the standard dynamic programming algorithm for MSM distance computation is $\Omega(n^2)$, where $n$ is the length of the longer time series. In this paper, we provide approaches to reducing the cost of MSM distance computations by using lower and upper bounds to prune paths early in the underlying dynamic programming table. For the case where one of the time series is constant, we present a linear-time algorithm. In addition, we propose new linear-time heuristics and adapt heuristics known from DTW to the computation of the MSM distance. One heuristic employs the metric property of MSM together with the aforementioned linear-time algorithm. Our experimental studies demonstrate substantial speed-ups of our approaches compared to previous MSM algorithms. In particular, our MSM distance computation is faster than a state-of-the-art DTW distance computation on a majority of the popular UCR data sets.
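For orientation, the quadratic baseline that the abstract refers to can be sketched as follows. This is a minimal illustration of the standard MSM dynamic program as commonly stated in the literature, not the pruned algorithm proposed in this paper. The function names `msm_cost` and `msm_distance` and the default split/merge cost `c=0.5` are assumptions chosen for the example.

```python
def msm_cost(new, a, b, c):
    """Cost of a split/merge step that introduces (or removes) `new`
    next to its neighbours `a` and `b`; `c` is the split/merge penalty."""
    if a <= new <= b or b <= new <= a:
        return c
    return c + min(abs(new - a), abs(new - b))


def msm_distance(x, y, c=0.5):
    """Standard O(m*n) dynamic program for the MSM distance.

    `c=0.5` is an illustrative default, not a value taken from the paper.
    """
    m, n = len(x), len(y)
    if m == 0 or n == 0:
        raise ValueError("both time series must be non-empty")

    # D[i][j] holds the MSM distance between x[:i+1] and y[:j+1].
    D = [[0.0] * n for _ in range(m)]
    D[0][0] = abs(x[0] - y[0])

    # First column and first row: only split/merge steps are possible.
    for i in range(1, m):
        D[i][0] = D[i - 1][0] + msm_cost(x[i], x[i - 1], y[0], c)
    for j in range(1, n):
        D[0][j] = D[0][j - 1] + msm_cost(y[j], x[0], y[j - 1], c)

    # Every remaining cell is filled, which gives the quadratic running time.
    for i in range(1, m):
        for j in range(1, n):
            D[i][j] = min(
                D[i - 1][j - 1] + abs(x[i] - y[j]),                 # move
                D[i - 1][j] + msm_cost(x[i], x[i - 1], y[j], c),    # split/merge in x
                D[i][j - 1] + msm_cost(y[j], x[i], y[j - 1], c),    # split/merge in y
            )
    return D[m - 1][n - 1]
```

Because every cell of the table is visited, the cost grows with the product of the two lengths. The bounds described in the abstract aim to skip cells that cannot lie on an optimal path.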