This paper studies how to achieve individual indistinguishability through pufferfish privacy in aggregated queries to a multi-user system. Each user is assumed to report a realization of a random variable. We study how to calibrate the Laplace noise added to the query answer so that pufferfish privacy is attained when a user changes the reported data value, leaves the system, or is replaced by another user with different randomness. Sufficient conditions for attaining statistical indistinguishability are derived for all of these scenarios on four sets of secret pairs, using the existing Kantorovich method (Wasserstein metric of order $1$). The results can be applied to attain indistinguishability when a certain class of users is added to or removed from a tabular dataset. They show that attaining indistinguishability of an individual's data depends only on the statistics of that user. For binary (Bernoulli-distributed) random variables, the derived sufficient conditions can be further relaxed to reduce the noise and improve data utility.
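As an illustration of the quantities the abstract refers to, the following minimal sketch computes the order-$1$ Wasserstein distance between the two conditional distributions of a sum query under a secret pair, for independent Bernoulli users, and then perturbs one query answer with Laplace noise. The function names, the example probabilities, and the noise scale are all assumptions made for illustration; the abstract does not state the paper's calibration rule, so the scale is left as a free, hypothetical parameter and the sketch makes no privacy claim.

```python
import random


def pmf_of_sum(probs):
    """PMF of a sum of independent Bernoulli(p) variables, on {0, ..., len(probs)}."""
    pmf = [1.0]
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)   # this user reports 0
            nxt[k + 1] += mass * p       # this user reports 1
        pmf = nxt
    return pmf


def w1_integer_support(pmf_a, pmf_b):
    """Order-1 Wasserstein distance between two PMFs on {0, 1, ..., n}.

    On the real line, W1 equals the L1 distance between the two CDFs,
    which on a unit-spaced integer grid is a plain sum.
    """
    n = max(len(pmf_a), len(pmf_b))
    a = pmf_a + [0.0] * (n - len(pmf_a))
    b = pmf_b + [0.0] * (n - len(pmf_b))
    cdf_a = cdf_b = total = 0.0
    for x, y in zip(a, b):
        cdf_a += x
        cdf_b += y
        total += abs(cdf_a - cdf_b)
    return total


def laplace_noise(scale, rng):
    """Zero-mean Laplace sample: the difference of two i.i.d. exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


# Hypothetical system: three other users plus the user of interest.
others = [0.2, 0.7, 0.5]

# Secret pair 1: the user of interest reports 0 versus 1.
w1_value = w1_integer_support(pmf_of_sum(others + [0.0]),
                              pmf_of_sum(others + [1.0]))

# Secret pair 2: a Bernoulli(0.3) user is replaced by a Bernoulli(0.8) user.
p, q = 0.3, 0.8
w1_replace = w1_integer_support(pmf_of_sum(others + [p]),
                                pmf_of_sum(others + [q]))

# Perturb one query answer. The scale is a placeholder, NOT the paper's rule.
rng = random.Random(0)
scale = 2.0
true_answer = 2
noisy = true_answer + laplace_noise(scale, rng)
print(w1_value, w1_replace, noisy)
```

In this toy setting both distances are unaffected by the other users' probabilities: `w1_value` is the size of the unit shift and `w1_replace` equals `abs(p - q)`. This is consistent with the abstract's observation that indistinguishability of an individual's data depends only on that user's statistics.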