We propose a new class of models for variable clustering, called Asymptotic Independent block (AI-block) models, which define population-level clusters through the independence between clusters of the maxima of a multivariate stationary mixing random process. This class of models is identifiable, meaning that there exists a maximal element with respect to a partial order on partitions, which makes statistical inference possible. We also present an algorithm that recovers the clusters of variables without specifying the number of clusters \emph{a priori}. Our theoretical results give insight into the consistency of the algorithm, showing that under certain conditions it identifies the clusters in the data with a computational complexity that is polynomial in the dimension. This implies that groups can be learned nonparametrically in settings where the block maxima of a dependent process are only sub-asymptotic. To illustrate the significance of our work, we apply our method to real datasets from neuroscience and environmental science. These applications highlight the potential and versatility of the proposed approach.
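As a rough illustration of the idea, and not the algorithm of the paper, the following Python sketch clusters variables from the dependence between their block maxima. Everything in it is an assumption made for illustration: the function names `block_maxima`, `extremal_correlation`, and `threshold_clusters`, the rank-based madogram estimate of pairwise extremal dependence, the greedy grouping rule, and the threshold `tau`. The number of clusters is not supplied; it results from the threshold.

```python
import numpy as np


def block_maxima(x, block_size):
    """Componentwise maxima over consecutive blocks of rows of x (shape n x d)."""
    n_blocks = x.shape[0] // block_size
    trimmed = x[: n_blocks * block_size]  # drop an incomplete final block
    return trimmed.reshape(n_blocks, block_size, x.shape[1]).max(axis=1)


def extremal_correlation(maxima):
    """Pairwise extremal correlation estimated with a rank-based madogram.

    Illustrative estimator (an assumption): nu = 0.5 * mean|U_i - U_j|,
    theta = (1 + 2 nu) / (1 - 2 nu), chi = 2 - theta, clipped to [0, 1].
    chi near 0 suggests independent maxima, chi near 1 strong dependence.
    """
    k, d = maxima.shape
    # Ranks scaled to (0, 1); assumes no ties among the block maxima.
    u = (maxima.argsort(axis=0).argsort(axis=0) + 1) / (k + 1)
    chi = np.eye(d)
    for i in range(d):
        for j in range(i + 1, d):
            nu = 0.5 * np.mean(np.abs(u[:, i] - u[:, j]))
            theta = (1 + 2 * nu) / (1 - 2 * nu)
            chi[i, j] = chi[j, i] = np.clip(2 - theta, 0.0, 1.0)
    return chi


def threshold_clusters(chi, tau):
    """Greedy grouping (hypothetical rule): no number of clusters is given."""
    remaining = list(range(chi.shape[0]))
    clusters = []
    while remaining:
        sub = chi[np.ix_(remaining, remaining)].copy()
        np.fill_diagonal(sub, -np.inf)
        a, b = np.unravel_index(np.argmax(sub), sub.shape)
        if sub[a, b] <= tau:
            # No remaining pair is dependent enough: emit a singleton.
            members = [remaining[a]]
        else:
            # Seed with the most dependent pair, then add every variable
            # that is dependent enough on both seeds.
            members = [
                remaining[c]
                for c in range(len(remaining))
                if c in (a, b) or min(sub[a, c], sub[b, c]) >= tau
            ]
        clusters.append(members)
        remaining = [v for v in remaining if v not in members]
    return clusters
```

A call such as `threshold_clusters(extremal_correlation(block_maxima(x, 20)), tau=0.5)` returns a list of index lists, one per estimated cluster. The pairwise loop and the greedy pass are both polynomial in the number of variables, but the block size and the threshold are tuning choices that this sketch does not address.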