We study the problem of privacy-preserving $k$-means clustering in the horizontally federated setting. Existing federated approaches based on secure computation incur substantial overhead and do not offer output privacy. At the same time, differentially private (DP) $k$-means algorithms either assume a trusted central curator or significantly degrade utility by adding noise in the local DP model. Naively combining the secure and central DP solutions results in a protocol with impractical overhead. Instead, our work enhances both the DP and the secure computation components, resulting in a design that is faster, more private, and more accurate than previous work. By utilizing the computational DP model, we design a lightweight approach based on secure aggregation that achieves a speed-up of five orders of magnitude over state-of-the-art related work. Furthermore, we not only match the utility of the state of the art in the central model of DP but improve on it by designing a new DP clustering mechanism.
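To make the general idea of combining secure aggregation with noisy $k$-means updates concrete, the following is a minimal single-process sketch of one clustering iteration. It is not the protocol proposed in this work: it assumes the textbook pairwise-masking form of secure aggregation, Gaussian noise that each client adds locally before masking, and points scaled to $[-1, 1]^d$. All names (`MOD`, `SCALE`, `local_stats`, `mask`, `dp_kmeans_step`) and the noise parameter `sigma` are hypothetical, the pair seeds are drawn from one shared generator rather than agreed on by key exchange, and calibrating `sigma` to a privacy budget is left out.

```python
import math
import random

MOD = 2**61    # modulus for masked integer arithmetic (assumed value)
SCALE = 2**16  # fixed-point scale for encoding reals (assumed value)


def local_stats(points, centroids):
    """Per-cluster coordinate sums and counts, flattened into one vector."""
    k, d = len(centroids), len(centroids[0])
    stats = [0.0] * (k * (d + 1))
    for p in points:
        # Assign the point to its nearest centroid (squared Euclidean distance).
        j = min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
        base = j * (d + 1)
        for t in range(d):
            stats[base + t] += p[t]
        stats[base + d] += 1.0
    return stats


def encode(vec):
    """Fixed-point encode reals as integers modulo MOD."""
    return [round(x * SCALE) % MOD for x in vec]


def decode(vec):
    """Map residues in [0, MOD) back to signed reals."""
    return [((x + MOD // 2) % MOD - MOD // 2) / SCALE for x in vec]


def mask(vec, client, n_clients, pair_seeds):
    """Add pairwise masks that cancel once all clients' vectors are summed."""
    out = list(vec)
    for other in range(n_clients):
        if other == client:
            continue
        lo, hi = min(client, other), max(client, other)
        rng = random.Random(pair_seeds[(lo, hi)])
        sign = 1 if client == lo else -1  # lower index adds, higher subtracts
        for t in range(len(out)):
            out[t] = (out[t] + sign * rng.randrange(MOD)) % MOD
    return out


def dp_kmeans_step(client_data, centroids, sigma, rng):
    """One noisy centroid update computed from masked client contributions."""
    n, k, d = len(client_data), len(centroids), len(centroids[0])
    # Stand-in for pairwise key agreement between clients (assumption).
    pair_seeds = {(i, j): rng.getrandbits(64)
                  for i in range(n) for j in range(i + 1, n)}
    masked = []
    for i, pts in enumerate(client_data):
        stats = local_stats(pts, centroids)
        # Each client adds a 1/n share of the noise variance, so the
        # aggregate carries Gaussian noise with standard deviation sigma.
        noisy = [x + rng.gauss(0.0, sigma / math.sqrt(n)) for x in stats]
        masked.append(mask(encode(noisy), i, n, pair_seeds))
    # The aggregator only ever sees masked vectors; the masks cancel in the sum.
    total = decode([sum(col) % MOD for col in zip(*masked)])
    new_centroids = []
    for j in range(k):
        base = j * (d + 1)
        count = max(total[base + d], 1.0)  # guard against tiny or negative counts
        new_centroids.append(
            [max(-1.0, min(1.0, total[base + t] / count)) for t in range(d)])
    return new_centroids
```

With `sigma = 0` the step reduces to an ordinary $k$-means update up to fixed-point rounding, which is a convenient sanity check that the masks cancel. A real deployment would additionally need discrete noise, dropout handling, and authenticated key agreement, none of which are modeled here.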