Local (seeded) clustering aims to find a compact cluster near a given set of starting instances. Most existing studies on graph clustering assume a discrete graph setting (i.e., unweighted, undirected graphs without self-loops), but real-world graphs can be more complex. In this paper, we extend the classic non-approximating Andersen-Chung-Lang (ACL) clustering algorithm beyond discrete graphs and generalize its quadratic optimality to a wider range of complex graphs, including weighted, directed, and self-looped graphs, as well as hypergraphs with edge-dependent vertex weights. Specifically, by leveraging PageRank, we propose two algorithms: GeneralACL for graphs and HyperACL for hypergraphs. We prove that, under two mild conditions, both algorithms can identify a quadratically optimal cluster in terms of conductance. Additionally, we provide experiments that validate our theoretical findings. Our code is available at https://github.com/iDEA-iSAIL-Lab-UIUC/HyperACL.
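As a rough illustration of the PageRank-plus-sweep idea behind ACL-style local clustering, the sketch below computes a personalized PageRank vector from a seed and then picks the lowest-conductance prefix of the degree-normalized ranking. It is not the paper's GeneralACL or HyperACL. It assumes a symmetric weighted adjacency matrix (self-loops allowed and counted once in the degree), uses plain power iteration in place of a push-based procedure, and all function names, parameters, and the toy graph are hypothetical.

```python
import numpy as np

def personalized_pagerank(W, seed, alpha=0.15, iters=200):
    """Personalized PageRank by power iteration (assumed teleport prob. alpha)."""
    n = W.shape[0]
    deg = W.sum(axis=1)
    P = W / deg[:, None]              # row-stochastic transition matrix
    s = np.zeros(n)
    s[seed] = 1.0                     # all teleport mass on the seed vertex
    p = s.copy()
    for _ in range(iters):
        p = alpha * s + (1 - alpha) * (p @ P)
    return p

def conductance(W, mask):
    """Cut weight divided by the smaller of the two side volumes."""
    deg = W.sum(axis=1)
    cut = W[mask][:, ~mask].sum()
    return cut / min(deg[mask].sum(), deg[~mask].sum())

def sweep_cut(W, p):
    """Scan prefixes of vertices sorted by p/deg; return the best one."""
    n = W.shape[0]
    deg = W.sum(axis=1)
    order = np.argsort(-p / deg)
    mask = np.zeros(n, dtype=bool)
    best_phi, best_k = np.inf, 0
    for k in range(n - 1):            # skip the full vertex set
        mask[order[k]] = True
        phi = conductance(W, mask)
        if phi < best_phi:
            best_phi, best_k = phi, k + 1
    return sorted(order[:best_k].tolist()), best_phi

# Toy graph (hypothetical): two unit-weight triangles joined by a weak edge,
# plus a self-loop on vertex 0.
W = np.zeros((6, 6))
for i, j, w in [(0, 1, 1), (0, 2, 1), (1, 2, 1),
                (3, 4, 1), (3, 5, 1), (4, 5, 1), (2, 3, 0.1)]:
    W[i, j] = W[j, i] = w
W[0, 0] = 0.5

cluster, phi = sweep_cut(W, personalized_pagerank(W, seed=0))
print(cluster, phi)
```

Seeding at vertex 0 should recover the triangle containing the seed, since the only edge leaving it is the weak bridge. Handling directed graphs or hypergraphs would require a different notion of volume and conductance, which this sketch does not attempt.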