We study differentially private (DP) algorithms for recovering clusters in well-clustered graphs, that is, graphs whose vertex set can be partitioned into a small number of sets, each inducing a subgraph with high inner conductance and low outer conductance. Such graphs are widely used as a benchmark in the theoretical analysis of spectral clustering. We provide an efficient $(\epsilon,\delta)$-DP algorithm tailored to such graphs. Our algorithm draws inspiration from the recent work of Chen et al., who developed DP algorithms for recovery of stochastic block models in cases where the graph comprises exactly two nearly balanced clusters. Our algorithm works for well-clustered graphs with $k$ nearly balanced clusters, and its misclassification ratio almost matches that of the best-known non-private algorithms. We conduct experimental evaluations on datasets with known ground-truth clusters to substantiate the effectiveness of our algorithm. We also show that any pure $\epsilon$-DP algorithm would incur substantial error.
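To make the notion of low outer conductance concrete, the following minimal sketch computes $|E(S, V \setminus S)| / \mathrm{vol}(S)$ for each cluster of a toy graph. It assumes the standard definition for undirected, unweighted graphs, which the abstract does not spell out. The function name `outer_conductance` and the example graph are illustrative assumptions, not taken from the paper. Inner conductance is not computed here, since it requires minimizing over cuts of the induced subgraph.

```python
from collections import defaultdict


def outer_conductance(edges, cluster):
    """Return |E(S, V minus S)| / vol(S) for an undirected, unweighted graph.

    Assumed definition: vol(S) is the sum of the degrees of the vertices in S.
    """
    cluster = set(cluster)
    degree = defaultdict(int)
    cut = 0
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        # An edge crosses the cut when exactly one endpoint lies in the cluster.
        if (u in cluster) != (v in cluster):
            cut += 1
    volume = sum(degree[v] for v in cluster)
    if volume == 0:
        raise ValueError("cluster has zero volume")
    return cut / volume


# Hypothetical toy graph: two triangles joined by the single edge (2, 3).
EDGES = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
CLUSTERS = [{0, 1, 2}, {3, 4, 5}]

if __name__ == "__main__":
    for c in CLUSTERS:
        print(sorted(c), outer_conductance(EDGES, c))
```

In this toy graph each triangle has one outgoing edge and volume 7, so both clusters have small outer conductance, which is the property a well-clustered partition is expected to satisfy.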