In this paper we consider change-points in multiple sequences, with the objective of minimizing the estimation error for one sequence by making use of information from the other sequences. This is in contrast to recent work on change-points in multiple sequences, where the focus is on the detection of common change-points. We start with the canonical case of a single sequence with constant change-point intensities. We consider two performance measures for a change-point algorithm. The first is the probability of estimating the change-point with no error. The second is the expected distance between the true and estimated change-points. We provide a theoretical upper bound on the no-error probability, and a lower bound on the expected distance, that must be satisfied by all algorithms. We propose a scan-CUSUM algorithm that achieves the no-error upper bound and comes close to the distance lower bound. We next consider the case of non-constant intensities and establish sharp conditions under which the estimation error can go to zero. We propose an extension of the scan-CUSUM algorithm for a non-constant intensity function, and show that it achieves asymptotically zero error at the boundary of the zero-error regime. Finally, we illustrate an application of the scan-CUSUM algorithm to multiple sequences sharing an unknown, non-constant intensity function. We estimate the intensity function from the change-point profile likelihoods of all sequences and apply scan-CUSUM with the estimated intensity function.
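To make the two performance measures concrete, the following minimal sketch estimates both by Monte Carlo for a single sequence with one mean shift. It uses the textbook CUSUM estimator, not the scan-CUSUM algorithm proposed here, whose details are not given in this abstract. The Gaussian noise model and all names and parameters (`cusum_estimate`, `simulate`, `n`, `tau`, `shift`, `reps`) are assumptions made for illustration only.

```python
import random


def cusum_estimate(x):
    """Standard CUSUM estimate of a single mean-shift location.

    Returns k in 1..n-1, the estimated number of observations
    before the change (illustrative, not the paper's scan-CUSUM).
    """
    n = len(x)
    total = sum(x)
    best_k, best_stat = 1, -1.0
    prefix = 0.0
    for k in range(1, n):
        prefix += x[k - 1]
        # Normalized CUSUM statistic: |S_k - (k/n) S_n| / sqrt(k (n - k) / n)
        stat = abs(prefix - k * total / n) / ((k * (n - k) / n) ** 0.5)
        if stat > best_stat:
            best_k, best_stat = k, stat
    return best_k


def simulate(n=200, tau=80, shift=1.5, reps=500, seed=0):
    """Monte Carlo estimates of the two measures for the estimator above.

    Assumed model: unit-variance Gaussian noise, mean 0 for the first
    tau observations and mean `shift` afterwards.
    Returns (no-error probability, expected distance |k_hat - tau|).
    """
    rng = random.Random(seed)
    exact = 0
    dist = 0
    for _ in range(reps):
        x = [rng.gauss(0.0, 1.0) + (shift if i >= tau else 0.0)
             for i in range(n)]
        k_hat = cusum_estimate(x)
        exact += (k_hat == tau)
        dist += abs(k_hat - tau)
    return exact / reps, dist / reps


if __name__ == "__main__":
    p_no_error, mean_dist = simulate()
    print("no-error probability:", p_no_error)
    print("expected distance:", mean_dist)
```

Under these assumptions, increasing `shift` should raise the no-error probability and shrink the expected distance, which is the trade-off the upper and lower bounds quantify.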