In 2002, Kleinberg proposed three axioms for distance-based clustering and proved that no clustering method can satisfy all three. While much subsequent work has examined and modified these axioms for distance-based clustering, little work has explored axioms relevant to the graph partitioning problem, i.e., when the input is a graph without a distance matrix. Here, we propose and explore axioms for graph partitioning in this distanceless setting, including modifications of Kleinberg's axioms and two others (one axiom relevant to the "Resolution Limit" and one addressing well-connectedness). We prove that clustering under the Constant Potts Model satisfies all the axioms, while Modularity clustering and Iterative k-core both fail many of the axioms we pose. These theoretical properties of the clustering methods are relevant both for theoretical investigation and for practitioners considering which methods to use in their domain science studies.