Practitioners often find it difficult to interpret the quality of differentially private (DP) statistics because of the added noise. One way to help analysts understand the amount of error introduced by DP is to return a Randomization Interval (RI) along with the statistic. An RI is a type of confidence interval that bounds the error introduced by DP. For queries where the noise distribution depends on the input, such as the median, prior work degrades the quality of the median itself to obtain a high-quality RI. In this work, we propose PostRI, a solution that computes an RI after the median has been estimated. PostRI enables median estimation with 14%-850% higher utility than related work, while maintaining a narrow RI.
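To make the notion of an RI concrete, the following minimal sketch covers the simple case where the noise distribution does not depend on the input: a counting query released with Laplace noise. It is not PostRI and does not apply to the median, whose noise distribution depends on the data. The function names, the choice of a counting query with sensitivity 1, and the parameters `epsilon` and `alpha` are assumptions made for illustration only. For Laplace noise with scale b, P(|noise| > t) = exp(-t/b), so the half-width of a (1 - alpha) RI is b ln(1/alpha).

```python
import math
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_ri_half_width(scale, alpha):
    # P(|Lap(scale)| > t) = exp(-t / scale), so solving for t at
    # tail probability alpha gives t = scale * ln(1 / alpha).
    return scale * math.log(1.0 / alpha)


def private_count_with_ri(true_count, epsilon, alpha=0.05, rng=None):
    # Hypothetical helper: releases a noisy count and its RI.
    # Assumes a counting query, whose sensitivity is 1.
    if rng is None:
        rng = random.Random()
    scale = 1.0 / epsilon
    noisy = true_count + laplace_noise(scale, rng)
    half = laplace_ri_half_width(scale, alpha)
    return noisy, (noisy - half, noisy + half)
```

Because the half-width here depends only on `epsilon` and `alpha`, the interval can be published without further privacy cost. That property is what is lost when the noise distribution depends on the input, as with the median.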