We study statistical inference on the similarity/distance between two time series in an uncertain environment by considering a statistical hypothesis test on the distance obtained from the Dynamic Time Warping (DTW) algorithm. The sampling distribution of the DTW distance is difficult to derive because the distance depends on the solution of the DTW algorithm, which is itself a complicated function of the data. To circumvent this difficulty, we propose to employ the conditional selective inference framework, which enables us to derive a valid inference method for the DTW distance. To our knowledge, this is the first method that can provide a valid p-value to quantify the statistical significance of the DTW distance, which is helpful for high-stakes decision making such as abnormal time-series detection. We evaluate the performance of the proposed inference method on both synthetic and real-world datasets.
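For readers unfamiliar with DTW, the quantity under test can be illustrated with a minimal sketch of the textbook dynamic program. This is not the proposed inference method; the function name `dtw_distance` and the squared-difference local cost are assumptions made for illustration only. The sketch shows why the distance is a data-dependent quantity: the alignment that attains the minimum is itself selected from the observed series.

```python
import math


def dtw_distance(x, y):
    """Textbook DTW distance between two numeric sequences.

    Assumption for illustration: the local cost is the squared difference.
    """
    n, m = len(x), len(y)
    # acc[i][j] is the minimal cumulative cost of aligning x[:i] with y[:j].
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = (x[i - 1] - y[j - 1]) ** 2
            # Extend the cheapest of the three admissible predecessor cells.
            acc[i][j] = cost + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]
```

For example, `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is zero because the repeated value can be absorbed by warping, whereas a pointwise comparison is not even defined for series of different lengths.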