Private closeness testing is the problem of deciding whether the underlying probability distributions of two sensitive datasets are identical or differ significantly in statistical distance, while guaranteeing (differential) privacy of the data. However, as with most (if not all) distribution testing questions studied under privacy constraints, previous work assumes that the two datasets are equally sensitive, i.e., that they must be given the same privacy guarantees. This assumption is often unrealistic, as different sources of data come with different privacy requirements; as a result, known closeness testing algorithms might be unnecessarily conservative, ``paying'' too high a privacy budget for half of the data. In this work, we initiate the study of the closeness testing problem under heterogeneous privacy constraints, where the two datasets come with distinct privacy requirements.