In 2002, Kleinberg proposed three axioms for distance-based clustering and proved that no clustering method can satisfy all three. Although much subsequent work has examined and modified these axioms for distance-based clustering, little has been done to explore axioms for the graph partitioning problem when the graph is unweighted and given without a distance matrix. Here, we propose and explore axioms for graph partitioning in this setting, including modifications of Kleinberg's axioms and three others: two axioms relevant to the ``Resolution Limit'' and one addressing well-connectedness. We prove that clustering under the Constant Potts Model satisfies all the axioms, while Modularity clustering and iterative k-core both fail many of the axioms we pose. These theoretical properties of the clustering methods are relevant both to theoretical investigation and to practitioners considering which methods to use in their domain science studies.
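To make the ``Resolution Limit'' contrast concrete, the following is a minimal sketch, not the paper's own construction or proofs. It assumes the standard textbook forms of the two objectives: modularity as the sum over clusters of $e_c/m - (d_c/2m)^2$, and the Constant Potts Model (CPM) score as the sum over clusters of $e_c - \gamma \binom{n_c}{2}$. Here $e_c$ is the number of intra-cluster edges, $d_c$ the total degree of the cluster, $n_c$ its size, and $m$ the number of edges. The ring-of-cliques test graph, the helper names, and the choice $\gamma = 0.5$ are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations


def ring_of_cliques(num_cliques, clique_size):
    """Edge list of cliques joined in a ring by single bridge edges (illustrative graph)."""
    edges = []
    for c in range(num_cliques):
        first = c * clique_size
        nodes = range(first, first + clique_size)
        edges.extend(combinations(nodes, 2))  # all edges inside clique c
        next_first = ((c + 1) % num_cliques) * clique_size
        edges.append((first + clique_size - 1, next_first))  # bridge to the next clique
    return edges


def modularity(edges, label):
    """Standard modularity: sum over clusters of e_c/m - (d_c / 2m)^2."""
    m = len(edges)
    intra, degree = Counter(), Counter()
    for u, v in edges:
        degree[label[u]] += 1
        degree[label[v]] += 1
        if label[u] == label[v]:
            intra[label[u]] += 1
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


def cpm(edges, label, gamma):
    """Constant Potts Model score: sum over clusters of e_c - gamma * C(n_c, 2)."""
    intra = Counter(label[u] for u, v in edges if label[u] == label[v])
    size = Counter(label.values())
    return sum(intra[c] - gamma * n * (n - 1) / 2 for c, n in size.items())


if __name__ == "__main__":
    k, size = 30, 5  # assumed example: 30 cliques of 5 nodes each
    edges = ring_of_cliques(k, size)
    singles = {v: v // size for v in range(k * size)}      # one cluster per clique
    pairs = {v: v // (2 * size) for v in range(k * size)}  # adjacent cliques merged
    print("modularity singles vs pairs:", modularity(edges, singles), modularity(edges, pairs))
    print("CPM (gamma=0.5) singles vs pairs:", cpm(edges, singles, 0.5), cpm(edges, pairs, 0.5))
```

On this assumed example, modularity scores the merged-pairs partition above the one-cluster-per-clique partition, while CPM with $\gamma = 0.5$ ranks them the other way. This is the kind of behavior the resolution-limit axioms are meant to capture. The sketch only evaluates two fixed partitions and does not run any optimizer.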