Change point methods are used to divide a sequence of observations into segments with different behaviour. Often, the distributional form of the observations is unknown, but the changes of interest are likely to involve shifts in location, scale, or both. We consider the problem of detecting multiple change points in a sequence without specifying a parametric model for the data. We propose the WBS-Lepage procedure, a nonparametric method which combines wild binary segmentation with a rank-based Lepage statistic. The statistic is formed from Mann-Whitney and Mood components, which are sensitive to changes in location and scale, respectively. Since it depends on the observations only through their ranks, the statistic is distribution-free under the null hypothesis of no change. This allows finite-sample thresholds to be calibrated by Monte Carlo simulation, providing direct control over the probability of falsely detecting change points when none exist. We compare WBS-Lepage with existing nonparametric change point methods, including penalised likelihood and binary-segmentation-based competitors. The proposed method performs competitively for location changes and is particularly effective for detecting changes in scale. We illustrate the procedure on a stylometric analysis of changes in an author's writing style and provide an implementation of our method in the accompanying R package npwbs.
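To make the construction concrete, the following is a minimal sketch of a Lepage-type statistic built from standardised Mann-Whitney (rank-sum) and Mood components, together with a Monte Carlo threshold for its maximum over split points. It is an illustration under stated assumptions, not the npwbs implementation: it scans a single segment rather than the random intervals of wild binary segmentation, it assumes continuous data without ties, and the function names (`lepage_from_ranks`, `max_lepage`, `mc_threshold`) and the `min_seg` parameter are hypothetical.

```python
import numpy as np

def lepage_from_ranks(ranks, k):
    """Lepage-type statistic comparing ranks[:k] with ranks[k:] (no ties assumed)."""
    N = len(ranks)
    m, n = k, N - k
    left = ranks[:k].astype(float)
    # Mann-Whitney / Wilcoxon rank-sum component, sensitive to location shifts.
    W = left.sum()
    W_mean = m * (N + 1) / 2.0
    W_var = m * n * (N + 1) / 12.0
    # Mood component, sensitive to scale changes.
    M = ((left - (N + 1) / 2.0) ** 2).sum()
    M_mean = m * (N ** 2 - 1) / 12.0
    M_var = m * n * (N + 1) * (N ** 2 - 4) / 180.0
    # Sum of the two squared standardised components.
    return (W - W_mean) ** 2 / W_var + (M - M_mean) ** 2 / M_var

def max_lepage(x, min_seg=5):
    """Maximise the statistic over split points; returns (statistic, split index)."""
    ranks = np.argsort(np.argsort(x)) + 1  # ranks 1..N, assuming no ties
    N = len(x)
    stats = [lepage_from_ranks(ranks, k) for k in range(min_seg, N - min_seg + 1)]
    best = int(np.argmax(stats))
    return stats[best], best + min_seg

def mc_threshold(N, alpha=0.05, n_sim=200, min_seg=5, seed=0):
    """Monte Carlo (1 - alpha) quantile of the maximum under no change.

    Under the null the ranks are a uniformly random permutation, so the
    threshold depends on N only and not on the data distribution.
    """
    rng = np.random.default_rng(seed)
    maxima = [max_lepage(rng.permutation(N), min_seg)[0] for _ in range(n_sim)]
    return float(np.quantile(maxima, 1 - alpha))

# Usage: a scale change (standard deviation 1 to 3) halfway through the sequence.
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(0, 1, 100), rng.normal(0, 3, 100)])
stat, k_hat = max_lepage(x)
thr = mc_threshold(len(x))
print("change detected:", stat > thr, "estimated location:", k_hat)
```

Because only ranks enter the statistic, applying any strictly increasing transformation to the data leaves both the statistic and the calibrated threshold unchanged, which is what allows the threshold to be simulated once for each sequence length.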