This paper presents a novel clustering algorithm from the SPINEX (Similarity-based Predictions with Explainable Neighbors Exploration) algorithmic family. The proposed clustering variant leverages similarity and higher-order interactions across multiple subspaces to group data into clusters. To assess the merit of SPINEX, we benchmarked it against 13 algorithms, namely Affinity Propagation, Agglomerative, Birch, DBSCAN, Gaussian Mixture, HDBSCAN, K-Means, KMedoids, Mean Shift, MiniBatch K-Means, OPTICS, Spectral Clustering, and Ward Hierarchical, across 51 synthetic and real datasets from various domains, dimensions, and complexities. We also present a companion complexity analysis that compares the complexity of SPINEX to that of these algorithms. Our results show that SPINEX ranks within the top five best-performing algorithms, can outperform commonly adopted clustering algorithms, and has moderate complexity. Finally, we demonstrate the explainability capabilities of SPINEX and outline future research needs.