This paper studies the problem of detecting and locating change points in multivariate time-evolving data. The problem has a long history in statistics and signal processing, and various algorithms have been developed, primarily for simple parametric models. In this work, we model the data through feed-forward neural networks and develop a detection strategy based on a two-step procedure. In the first step, the neural network is trained over a prespecified window of the data, and its test error function is calibrated over another prespecified window. In the second step, the test error function is applied over a moving window to identify a change point. Once a change point is detected, the two-step procedure is repeated until all change points are identified. The proposed strategy yields consistent estimates of both the number and the locations of the change points under temporal dependence of the data-generating process. Its effectiveness is illustrated on synthetic data sets, which also provide insight into how to select the tuning parameters of the algorithm in practice, and on real data sets. Finally, we note that although the detection strategy is general and can work with different neural network architectures, the theoretical guarantees provided are specific to feed-forward neural architectures.
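The train, calibrate, and monitor loop described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm: the learning task (one-step-ahead prediction), the threshold rule (calibration mean plus `k` scaled standard deviations), and the rule for locating the change point (end of the first window that exceeds the threshold) are all assumptions made here. To keep the sketch short, a one-hidden-layer feed-forward network with random hidden weights and a ridge-fitted output layer is swapped in for a gradient-trained network. The names `fit_random_feature_net`, `detect_change_points`, `train`, `calib`, `window`, and `k` are hypothetical.

```python
import numpy as np


def fit_random_feature_net(X, Y, hidden=16, ridge=1.0, rng=None):
    """One-hidden-layer feed-forward net: random tanh layer, ridge-fitted output.

    Stand-in for a gradient-trained network (an assumption of this sketch).
    """
    rng = np.random.default_rng(0) if rng is None else rng
    W = rng.normal(size=(X.shape[1], hidden))
    b = rng.normal(size=hidden)
    H = np.tanh(X @ W + b)
    # Closed-form ridge regression for the output-layer weights.
    beta = np.linalg.solve(H.T @ H + ridge * np.eye(hidden), H.T @ Y)
    return lambda Z: np.tanh(Z @ W + b) @ beta


def detect_change_points(x, train=100, calib=50, window=20, k=8.0):
    """Sequentially flag change points in x of shape (T, d).

    Window sizes and the multiplier k are illustrative tuning parameters.
    """
    x = np.asarray(x, dtype=float)
    T = len(x)
    found, start = [], 0
    while start + train + calib + window <= T:
        # Step 1a: train a one-step-ahead predictor on the training window.
        seg = x[start:start + train]
        predict = fit_random_feature_net(seg[:-1], seg[1:])
        c0 = start + train  # first index of the calibration window
        # err[i] is the squared error of predicting x[c0 + i] from x[c0 + i - 1].
        err = np.sum((predict(x[c0 - 1:T - 1]) - x[c0:T]) ** 2, axis=1)
        # Step 1b: calibrate a threshold for the windowed test error
        # (assumed rule: calibration mean plus k scaled standard deviations).
        thr = err[:calib].mean() + k * err[:calib].std() / np.sqrt(window)
        # Step 2: moving-window mean of the test error after calibration.
        means = np.convolve(err[calib:], np.ones(window) / window, mode="valid")
        hits = np.flatnonzero(means > thr)
        if hits.size == 0:
            break  # no further change detected
        # Report the last index of the first window that exceeds the threshold.
        cp = int(c0 + calib + hits[0] + window - 1)
        found.append(cp)
        start = cp  # restart the two-step procedure after the change point
    return found


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    means = np.repeat([0.0, 5.0, -5.0], 250)[:, None]  # mean shifts at t=250, 500
    data = means + 0.3 * rng.normal(size=(750, 2))
    print(detect_change_points(data))
```

The synthetic series in the usage example has two abrupt mean shifts, so the sketch should flag locations close to them. In practice `train`, `calib`, `window`, and `k` trade off detection delay against false alarms, and the scaling by the square root of `window` treats the errors as roughly independent, which is a simplification relative to the temporally dependent setting the abstract considers.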