Data represented as covariance-type matrices arise in many fields, including brain functional connectivity and diffusion tensor imaging. We develop MFM-Wishart, a Bayesian model-based clustering approach for such data that combines Wishart mixture components with a mixture-of-finite-mixtures (MFM) prior, allowing joint posterior inference on both the number of clusters and the cluster assignments. Theoretically, we study the properties of Wishart kernels in the context of mixture models and then establish posterior consistency for the number of clusters and posterior contraction of the mixing measure under standard regularity conditions. Computationally, we develop an efficient Markov chain Monte Carlo (MCMC) algorithm for posterior inference. Simulation studies show competitive clustering performance and accurate recovery of the number of clusters, even under model misspecification. We apply MFM-Wishart to cluster infants based on functional connectivity during sleep, estimated from functional near-infrared spectroscopy (fNIRS) data, illustrating the practical utility of the model and revealing interpretable heterogeneity.
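To make the generative structure described in the abstract concrete, the following is a minimal simulation sketch of a Wishart mixture whose number of components is itself random, as under an MFM-style prior. It is not the authors' implementation. The specific choices here are assumptions made for illustration only: a shifted Poisson prior on the number of clusters, symmetric Dirichlet weights, a shared integer degrees-of-freedom value, the way cluster scale matrices are generated, and the function names `sample_wishart` and `simulate_mfm_wishart`.

```python
import numpy as np


def sample_wishart(rng, df, scale):
    """Draw one Wishart(df, scale) matrix for integer df >= p.

    Uses the fact that a sum of df outer products of N(0, scale)
    vectors is Wishart distributed.
    """
    p = scale.shape[0]
    chol = np.linalg.cholesky(scale)
    g = chol @ rng.standard_normal((p, df))  # columns are N(0, scale) draws
    return g @ g.T


def simulate_mfm_wishart(n, p, rng, gamma=1.0, df=None, k_rate=1.0):
    """Simulate n p-by-p matrices from a Wishart mixture with random K.

    All prior choices below are illustrative assumptions, not the
    specification used in the paper.
    """
    df = p + 2 if df is None else df

    # Assumed prior on the number of clusters: K - 1 ~ Poisson(k_rate).
    k = 1 + rng.poisson(k_rate)

    # Mixture weights given K: symmetric Dirichlet(gamma, ..., gamma).
    weights = rng.dirichlet(np.full(k, gamma))

    # Hypothetical cluster-specific mean matrices (positive definite).
    means = []
    for _ in range(k):
        b = rng.standard_normal((p, p))
        means.append(np.eye(p) + b @ b.T / p)

    # Cluster assignments, then one Wishart draw per observation.
    # Dividing the scale by df makes the cluster mean equal to means[z].
    labels = rng.choice(k, size=n, p=weights)
    data = np.stack([sample_wishart(rng, df, means[z] / df) for z in labels])
    return data, labels, k


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data, labels, k = simulate_mfm_wishart(n=50, p=4, rng=rng)
    print("number of clusters drawn:", k)
    print("data shape:", data.shape)
```

Posterior inference reverses this process: given only the observed matrices, the model infers both the number of clusters and the assignments. The MCMC algorithm mentioned in the abstract is not reproduced here.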