We propose a new, computationally efficient, sparsity-adaptive changepoint estimator for detecting changes in unknown subsets of a high-dimensional data sequence. Assuming the data sequence is Gaussian, we prove that the new method successfully estimates the number and locations of changepoints with a given error rate and under minimal conditions, for all sparsities of the changing subset. Moreover, our method has computational complexity that is linear, up to logarithmic factors, in both the length and the number of time series, making it applicable to large data sets. Through extensive numerical studies, we show that the new methodology is highly competitive in terms of both estimation accuracy and computational cost. The practical usefulness of the method is illustrated by analysing sensor data from a hydropower plant. An efficient R implementation is available.