Despite the fundamental importance of clustering, much of the relevant research still rests on ambiguous foundations, leaving it unclear whether and how the various clustering methods are connected to one another. In this work, we take a further step towards resolving such ambiguities by presenting a general clustering framework that subsumes a series of seemingly disparate clustering methods, including several methods belonging to the widely popular spectral clustering framework. The generality of the proposed framework also sheds light on the largely unexplored area of multi-view graphs in which each view may have differently clustered nodes. We then propose GenClus: a method that is simultaneously an instance of this framework and a generalization of spectral clustering, while also being closely related to k-means. This results in a principled alternative to the few existing methods that study this special type of multi-view graph. In-depth experiments demonstrate that GenClus is more computationally efficient than existing methods while attaining similar or better clustering performance. Lastly, a qualitative real-world case study further demonstrates the ability of GenClus to produce meaningful clusterings.