We study the problem of privately releasing an approximate minimum spanning tree (MST). Given a graph $G = (V, E, \vec{W})$, where $V$ is a set of $n$ vertices, $E$ is a set of $m$ undirected edges, and $\vec{W} \in \mathbb{R}^{|E|}$ is an edge-weight vector, our goal is to publish an approximate MST under edge-weight differential privacy, as introduced by Sealfon in PODS 2016, where $V$ and $E$ are considered public and the weight vector is private. Our neighboring relation is defined by the $\ell_\infty$-distance between weight vectors: for a sensitivity parameter $\Delta_\infty$, graphs $G = (V, E, \vec{W})$ and $G' = (V, E, \vec{W}')$ are neighboring if $\|\vec{W}-\vec{W}'\|_\infty \leq \Delta_\infty$. Existing private MST algorithms face a trade-off, sacrificing either computational efficiency or accuracy. We show that it is possible to get the best of both worlds: with a suitable random perturbation of the input, one that does not by itself suffice to make the weight vector private, the output of any non-private MST algorithm is private and achieves a state-of-the-art error guarantee. Furthermore, by establishing a connection to private top-$k$ selection [Steinke and Ullman, FOCS '17], we give the first privacy-utility trade-off lower bound for MST under approximate differential privacy, demonstrating that the error magnitude, $\tilde{O}(n^{3/2})$, is optimal up to logarithmic factors. That is, our approach matches the time complexity of any non-private MST algorithm and at the same time achieves optimal error. We complement our theoretical treatment with experiments that confirm the practicality of our approach.
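The perturb-then-solve idea described above can be illustrated with a minimal sketch. The abstract does not state the noise distribution or its scale, so the Gaussian noise and the parameter `sigma` below are assumptions made purely for illustration, as are the function names `kruskal_mst`, `perturbed_mst`, and `tree_error`. The sketch makes no privacy claim of its own; it only shows the shape of the pipeline: perturb the private weights once, run an ordinary MST algorithm on the perturbed weights, and measure the error of the released tree against the true weights.

```python
import random


def kruskal_mst(n, edges, weights):
    """Plain (non-private) Kruskal: return indices of the edges in an MST.

    Assumes vertices are 0..n-1 and the graph is connected.
    """
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for i in sorted(range(len(edges)), key=lambda j: weights[j]):
        u, v = edges[i]
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append(i)
    return tree


def perturbed_mst(n, edges, weights, sigma, seed=None):
    """Add independent noise to every weight, then run the non-private solver.

    ASSUMPTION: Gaussian noise with standard deviation `sigma` is a
    placeholder; the distribution and scale required for a privacy
    guarantee are not given in the abstract.
    """
    rng = random.Random(seed)
    noisy = [w + rng.gauss(0.0, sigma) for w in weights]
    return kruskal_mst(n, edges, noisy)


def tree_error(n, edges, weights, tree):
    """Excess true weight of `tree` over an exact MST (always >= 0)."""
    exact = kruskal_mst(n, edges, weights)
    return sum(weights[i] for i in tree) - sum(weights[i] for i in exact)


if __name__ == "__main__":
    # Public topology (V, E); the weight vector is the private part.
    n = 4
    edges = [(0, 1), (1, 2), (2, 3), (0, 3), (0, 2)]
    weights = [1.0, 2.0, 3.0, 4.0, 5.0]

    tree = perturbed_mst(n, edges, weights, sigma=1.0, seed=0)
    print("released edges:", [edges[i] for i in tree])
    print("error vs. exact MST:", tree_error(n, edges, weights, tree))
```

Because the noise is added only once to the input, the running time is that of the chosen MST algorithm plus a linear pass over the $m$ edges. Any other solver (for example Prim's or Borůvka's algorithm) could replace `kruskal_mst` without changing the rest of the sketch.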