Cluster detection is a crucial task across many disciplines, such as statistics, engineering, and bioinformatics. We focus mainly on the modern high-dimensional setting, where traditional methods can fail due to the curse of dimensionality. In this study, we propose a non-parametric framework for clustering that can be applied in arbitrary dimensions. Simulation results show that the new framework outperforms existing methods under a wide range of settings. We illustrate the proposed method on real gene expression data, where the task is to distinguish cancer tissues from normal tissues.