Integer data is typically made differentially private by adding noise from a Discrete Laplace (or Discrete Gaussian) distribution. We study the setting where differential privacy of a counting query is achieved using bit-wise randomized response, i.e., independent, random bit flips on the encoding of the query answer. Binary error-correcting codes transmitted through noisy channels with independent bit flips are well-studied in information theory. However, such codes are unsuitable for differential privacy since they have (by design) high sensitivity, i.e., neighboring integers have encodings with a large Hamming distance. Gray codes show that it is possible to create an efficient sensitivity-1 encoding, but they are also unsuitable for differential privacy because they lack noise-robustness. Our main result is that it is possible, with a constant-rate code, to simultaneously achieve the sensitivity of Gray codes and the noise-robustness of error-correcting codes (down to the noise level required for differential privacy). An application of this new encoding of the integers is a faster, space-optimal differentially private data structure for histograms.
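To make the tension described above concrete, the following minimal Python sketch illustrates the two baseline ingredients only: the standard binary-reflected Gray code (sensitivity 1) and bit-wise randomized response. It is not the encoding constructed in this work. The function names, the 10-bit width, and the flip probability 1/(1 + e^ε) are assumptions chosen for illustration; that flip probability is the usual choice that makes a single differing bit ε-differentially private.

```python
import math
import random


def gray_encode(n: int) -> int:
    """Binary-reflected Gray code of a non-negative integer."""
    return n ^ (n >> 1)


def gray_decode(g: int) -> int:
    """Invert gray_encode by XOR-ing together all right shifts of g."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def randomized_response(code: int, num_bits: int, epsilon: float,
                        rng: random.Random) -> int:
    """Flip each of the num_bits bits independently.

    Assumed flip probability: p = 1 / (1 + e^epsilon), so that
    (1 - p) / p = e^epsilon for any single differing bit.
    """
    p = 1.0 / (1.0 + math.exp(epsilon))
    mask = 0
    for i in range(num_bits):
        if rng.random() < p:
            mask |= 1 << i
    return code ^ mask


if __name__ == "__main__":
    # Sensitivity: plain binary vs. Gray code for the neighbors 7 and 8.
    print(hamming_distance(7, 8))                            # plain binary
    print(hamming_distance(gray_encode(7), gray_encode(8)))  # Gray code

    # Lack of noise-robustness: one flipped high-order bit moves the
    # decoded value far away from the true answer 0.
    print(gray_decode(gray_encode(0) ^ (1 << 7)))

    # A noisy release of the count 300 using 10 bits and epsilon = 1.
    rng = random.Random(0)
    noisy = randomized_response(gray_encode(300), 10, 1.0, rng)
    print(gray_decode(noisy))
```

Because neighboring counts differ in exactly one Gray-code bit, the noisy codeword is ε-differentially private under these assumptions, yet a single flip in a high-order position can shift the decoded count by a large amount. This is the lack of noise-robustness that the abstract refers to.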