Change-point models deal with ordered data sequences, and their primary goal is to infer the locations at which some aspect of the sequence changes. In this paper, we propose and implement a nonparametric Bayesian model, fitted via a Gibbs sampler, for clustering observations based on their constant-wise change-point profiles. Our model places a Dirichlet process on the constant-wise change-point structures, so that observations are clustered and change-points are estimated simultaneously. Additionally, our approach controls the number of clusters in the model and does not require the number of clusters to be specified a priori. We evaluate the method's performance on simulated data under various scenarios and on a publicly available single-cell copy-number dataset.
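To illustrate why a Dirichlet process prior does not require the number of clusters to be fixed in advance, the following minimal sketch draws cluster labels from a Chinese restaurant process, the standard representation of the random partition induced by a Dirichlet process. It shows the prior over partitions only; it is not the Gibbs sampler or the change-point likelihood of the proposed model. The function name `crp_assignments` and the concentration parameter `alpha` are illustrative assumptions and do not come from the paper.

```python
import random


def crp_assignments(n, alpha, seed=0):
    """Draw cluster labels for n observations from a Chinese restaurant process.

    Hypothetical helper for illustration only: `alpha` (> 0) is the
    concentration parameter; larger values tend to produce more clusters.
    """
    rng = random.Random(seed)
    labels = []   # labels[i] = cluster index of observation i
    counts = []   # counts[k] = number of observations currently in cluster k
    for _ in range(n):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is opened with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)      # open a new cluster
        else:
            counts[k] += 1        # join an existing cluster
        labels.append(k)
    return labels, counts


if __name__ == "__main__":
    labels, counts = crp_assignments(n=50, alpha=1.0)
    print("number of clusters:", len(counts))
    print("cluster sizes:", counts)
```

The number of clusters is an outcome of the draw rather than an input. In a full model of this kind, cluster labels would be resampled conditionally on the data within each Gibbs iteration.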