We study the trade-off between privacy and statistical accuracy for multivariate data subject to \textit{componentwise local differential privacy} (CLDP). Under CLDP, each component of the private data is released through its own privacy channel. This allows different components to receive different levels of privacy protection, and allows each component to be privatized by a different entity with its own privacy policy. We develop general techniques for establishing minimax bounds that quantify the statistical cost of privacy in this setting as a function of the privacy levels $\alpha_1, \dots, \alpha_d$ of the $d$ components. We illustrate the versatility and efficiency of these techniques through several statistical applications. Specifically, we examine nonparametric density and covariance estimation under CLDP, providing upper and lower bounds that match up to constant factors, together with an associated data-driven adaptive procedure. Furthermore, we quantify the probability of extracting sensitive information from one component by exploiting the weaker privacy protection guaranteed on another, possibly correlated, component.
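To make the componentwise setting concrete, the following minimal sketch releases each coordinate of a single observation through its own channel with its own privacy level. It is an illustration under stated assumptions, not the mechanism analysed in the paper: we assume every component lies in $[-1, 1]$ and use the standard Laplace mechanism, and the function name `privatize_componentwise` is hypothetical.

```python
import random


def privatize_componentwise(x, alphas, rng=random):
    """Release each coordinate of x through its own Laplace channel.

    Assumption (not from the paper): every x[j] lies in [-1, 1], so the
    sensitivity of coordinate j is 2 and Laplace noise of scale
    2 / alphas[j] makes the released coordinate alphas[j]-locally private.
    """
    if len(x) != len(alphas):
        raise ValueError("need exactly one privacy level per component")
    released = []
    for xj, aj in zip(x, alphas):
        if not -1.0 <= xj <= 1.0:
            raise ValueError("components must lie in [-1, 1]")
        if aj <= 0:
            raise ValueError("privacy levels must be positive")
        scale = 2.0 / aj
        # The difference of two i.i.d. exponentials with mean `scale`
        # follows a Laplace(0, scale) distribution.
        noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
        released.append(xj + noise)
    return released


def componentwise_means(samples):
    """Average privatized observations coordinate by coordinate."""
    n = len(samples)
    d = len(samples[0])
    return [sum(z[j] for z in samples) / n for j in range(d)]
```

Because the noise is centred, averaging the released vectors gives an unbiased estimate of each component's mean, and a component with a smaller $\alpha_j$ (stronger privacy) is estimated with larger variance. In this simple setting, that is the kind of dependence on $\alpha_1, \dots, \alpha_d$ that the minimax bounds describe.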