We study efficient mechanisms for differentially private kernel density estimation (DP-KDE). Prior work on the Gaussian kernel described algorithms whose running time is exponential in the number of dimensions $d$. This paper breaks the exponential barrier and shows how the KDE can be privately approximated in time linear in $d$, making it feasible for high-dimensional data. We also present improved bounds for low-dimensional data. Our results are obtained through a general framework, which we term Locality Sensitive Quantization (LSQ), for constructing private KDE mechanisms where existing KDE approximation techniques can be applied. It lets us leverage several efficient non-private KDE methods (such as Random Fourier Features, the Fast Gauss Transform, and Locality Sensitive Hashing) and ``privatize'' them in a black-box manner. Our experiments demonstrate that the resulting DP-KDE mechanisms are fast and accurate on large datasets in both high and low dimensions.
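To make the idea of ``privatizing'' a non-private KDE method concrete, the following is a minimal sketch that combines Random Fourier Features with the Laplace mechanism. It is an illustration under stated assumptions and is not claimed to be the LSQ mechanism of this paper: the function names (`rff_map`, `private_rff_sketch`, `query_kde`) are hypothetical, the kernel is taken to be $\exp(-\|x-y\|^2/(2\sigma^2))$, the dataset size $n$ is treated as public, and neighboring datasets are assumed to differ by replacing one point.

```python
import numpy as np

def rff_map(X, W, b):
    """Random Fourier Features: each coordinate lies in [-sqrt(2/D), sqrt(2/D)]."""
    D = W.shape[1]
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

def private_rff_sketch(X, sigma, num_features, epsilon, rng):
    """Release a noisy mean feature vector (assumed setting: replace-one neighbors, public n)."""
    n, d = X.shape
    # Data-independent randomness; safe to publish alongside the sketch.
    W = rng.normal(scale=1.0 / sigma, size=(d, num_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=num_features)
    # Cost is O(n * d * num_features), i.e. linear in the dimension d.
    mean_feat = rff_map(X, W, b).mean(axis=0)
    # Replacing one point moves each coordinate by at most 2*sqrt(2/D)/n,
    # so the L1 sensitivity is at most 2*sqrt(2*D)/n.
    l1_sens = 2.0 * np.sqrt(2.0 * num_features) / n
    noise = rng.laplace(scale=l1_sens / epsilon, size=num_features)
    return W, b, mean_feat + noise

def query_kde(sketch, Y):
    """Estimate (1/n) * sum_i k(x_i, y) for each row y of Y from the private sketch."""
    W, b, noisy_mean = sketch
    return rff_map(Y, W, b) @ noisy_mean
```

Because the noise is added once to the released vector, any number of queries can be answered from the sketch without further privacy cost; the price is that the noise scale grows with the number of features, so `num_features` trades approximation error against privacy noise.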