Change-point detection studies the problem of detecting changes in the underlying distribution of a data stream as quickly as possible after they occur. Modern large-scale, high-dimensional, and complex streaming data call for sequential change-point detection algorithms that are efficient in both computation and memory while remaining statistically powerful. This gives rise to a trade-off between computation and statistical power, an aspect that has received less emphasis in the classical literature. This tutorial takes this new perspective and reviews several sequential change-point detection procedures, ranging from classic algorithms to more recent non-parametric procedures designed with computation, memory efficiency, and model robustness in mind. Our survey also covers classic performance analysis, which still provides useful techniques for analyzing new procedures.