We propose a graph clustering formulation based on multicut (a.k.a. weighted correlation clustering) on the complete graph. Unlike the original sparse formulation of multicut, our formulation does not require the graph topology to be specified, which makes our approach simpler and potentially better performing. In contrast to unweighted correlation clustering, we allow for a more expressive weighted cost structure. In dense multicut, the clustering objective is given in factorized form as inner products of node feature vectors. This permits efficient representation and inference, whereas standard multicut/weighted correlation clustering has at least quadratic representation and computation complexity when working on the complete graph. We show how to rewrite classical greedy algorithms for multicut in our dense setting and how to modify them for greater efficiency and solution quality. In particular, our algorithms scale to graphs with tens of thousands of nodes. Empirical evidence on instance segmentation on Cityscapes and on clustering of ImageNet datasets shows the merits of our approach.
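To illustrate why a factorized objective avoids the quadratic cost, the following minimal sketch (not the paper's algorithm) evaluates the sum of pairwise costs inside each cluster when every edge cost is the inner product of two node feature vectors. It relies on the identity $\sum_{i<j \in C} \langle f_i, f_j \rangle = \tfrac{1}{2}\big(\lVert \sum_{i \in C} f_i \rVert^2 - \sum_{i \in C} \lVert f_i \rVert^2\big)$, so the full cost matrix is never formed. The function names, the use of NumPy, and the choice to report the intra-cluster sum (rather than the cut cost under a particular sign convention) are assumptions made for illustration.

```python
import numpy as np


def intra_cluster_cost_dense(features, labels):
    """Sum of <f_i, f_j> over pairs i < j in the same cluster.

    Runs in O(n * d) time and memory; the n x n cost matrix is never built.
    `features` has shape (n, d); `labels` has shape (n,) with integer cluster ids.
    """
    total = 0.0
    for k in np.unique(labels):
        members = features[labels == k]          # rows belonging to cluster k
        s = members.sum(axis=0)                  # aggregated cluster feature
        sq_norms = np.einsum("ij,ij->", members, members)  # sum of ||f_i||^2
        total += 0.5 * (s @ s - sq_norms)
    return float(total)


def intra_cluster_cost_naive(features, labels):
    """Reference implementation that materializes all n^2 pairwise costs."""
    costs = features @ features.T                       # explicit cost matrix
    same = labels[:, None] == labels[None, :]           # same-cluster mask
    upper = np.triu(np.ones_like(same, dtype=bool), k=1)  # keep pairs i < j
    return float(costs[same & upper].sum())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.standard_normal((200, 16))   # hypothetical node features
    labels = rng.integers(0, 5, size=200)       # hypothetical clustering
    dense = intra_cluster_cost_dense(features, labels)
    naive = intra_cluster_cost_naive(features, labels)
    print(np.isclose(dense, naive))
```

The same aggregation idea is what makes cluster-level join costs cheap in this setting: the total cost between two clusters is the inner product of their summed feature vectors, so a greedy merge step can update one vector instead of a row of the cost matrix.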