We study statistical inference on the similarity/distance between two time series in an uncertain environment by considering a statistical hypothesis test on the distance computed by the Dynamic Time Warping (DTW) algorithm. The sampling distribution of the DTW distance is difficult to derive because the distance depends on the optimal alignment selected by the DTW algorithm, which is itself a complicated function of the data. To circumvent this difficulty, we propose to employ the conditional selective inference framework, which enables us to derive a valid inference method for the DTW distance. To our knowledge, this is the first method that can provide a valid p-value to quantify the statistical significance of the DTW distance, which is helpful for high-stakes decision making such as abnormal time-series detection. We evaluate the performance of the proposed inference method on both synthetic and real-world datasets.
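For readers unfamiliar with the test statistic, the DTW distance can be sketched with the standard dynamic program below. This is only an illustration of the quantity being tested, not the proposed inference method, and it does not compute a p-value. The function name `dtw_distance` and the squared-difference local cost are assumptions made for this sketch; the abstract does not specify them.

```python
def dtw_distance(x, y):
    """Return the DTW distance between two sequences of numbers.

    Assumption (not stated in the abstract): the local cost between
    two points is their squared difference.
    """
    n, m = len(x), len(y)
    inf = float("inf")
    # D[i][j] holds the minimal accumulated cost of aligning
    # the first i points of x with the first j points of y.
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = (x[i - 1] - y[j - 1]) ** 2
            # Extend the cheapest of the three admissible predecessor cells.
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]
```

The `min` over predecessor cells is what makes the optimal alignment data-dependent, which is the reason the abstract gives for the sampling distribution of the distance being hard to derive directly.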