We propose a new, computationally efficient, sparsity-adaptive changepoint estimator for detecting changes in unknown subsets of a high-dimensional data sequence. Assuming the data sequence is Gaussian, we prove that the new method successfully estimates the number and locations of changepoints with a given error rate and under minimal conditions, for all sparsities of the changing subset. Moreover, our method has computational complexity that is linear, up to logarithmic factors, in both the length and the number of time series, making it applicable to large data sets. Through extensive numerical studies, we show that the new methodology is highly competitive in terms of both estimation accuracy and computational cost. The practical usefulness of the method is illustrated by analysing sensor data from a hydropower plant. An efficient R implementation is available.