Integer data is typically made differentially private by adding noise from a Discrete Laplace (or Discrete Gaussian) distribution. We study the setting where differential privacy of a counting query is achieved using bit-wise randomized response, i.e., independent, random bit flips on the encoding of the query answer. Binary error-correcting codes transmitted through noisy channels with independent bit flips are well-studied in information theory. However, such codes are unsuitable for differential privacy since they have (by design) high sensitivity, i.e., neighbouring integers have encodings with a large Hamming distance. Gray codes show that it is possible to create an efficient sensitivity-1 encoding, but they too are unsuitable for differential privacy because they lack noise-robustness. Our main result is that it is possible, with a constant-rate code, to simultaneously achieve the sensitivity of Gray codes and the noise-robustness of error-correcting codes (down to the noise level required for differential privacy). An application of this new encoding of the integers is an asymptotically faster, space-optimal differentially private data structure for histograms.
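To make the two properties discussed above concrete, the following is a minimal Python sketch. It is not the encoding constructed in this work; it only illustrates the baseline behaviour of a binary-reflected Gray code under bit-wise randomized response. All function names are hypothetical, and the flip probability 1/(1 + e^ε) is the standard randomized-response calibration, assumed here for an encoding in which neighbouring integers differ in exactly one bit.

```python
import math
import random


def gray_encode(n: int) -> int:
    """Binary-reflected Gray code of a non-negative integer."""
    return n ^ (n >> 1)


def gray_decode(g: int) -> int:
    """Invert gray_encode by XOR-ing together all right shifts of g."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def max_neighbour_distance(encode, limit: int) -> int:
    """Largest Hamming distance between encodings of i and i + 1 for i < limit."""
    return max(hamming(encode(i), encode(i + 1)) for i in range(limit))


def flip_probability(epsilon: float) -> float:
    """Per-bit flip probability (assumption: standard randomized response)."""
    return 1.0 / (1.0 + math.exp(epsilon))


def randomized_response(word: int, width: int, epsilon: float,
                        rng: random.Random) -> int:
    """Flip each of the `width` low-order bits of `word` independently."""
    p = flip_probability(epsilon)
    mask = 0
    for i in range(width):
        if rng.random() < p:
            mask |= 1 << i
    return word ^ mask


if __name__ == "__main__":
    rng = random.Random(0)
    # Sensitivity: Gray code versus the plain binary encoding on 8-bit values.
    print("Gray code:", max_neighbour_distance(gray_encode, 255))
    print("Plain binary:", max_neighbour_distance(lambda x: x, 255))
    # Lack of noise-robustness: one flipped bit can move the decoded value far.
    print("Decoded after one flip:", gray_decode(gray_encode(0) ^ 0x80))
    # One noisy release of the count 42 with epsilon = 1.
    noisy = randomized_response(gray_encode(42), 8, 1.0, rng)
    print("Noisy decode of 42:", gray_decode(noisy))
```

The sketch shows both halves of the trade-off: neighbouring integers always differ in a single Gray-code bit, yet flipping only the most significant bit of the encoding of 0 decodes to 255, which is why a plain Gray code does not tolerate the noise that randomized response introduces.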