We explore the trade-off between privacy and statistical utility in private two-sample testing under local differential privacy (LDP) for both multinomial and continuous data. We begin by addressing the multinomial case, where we introduce private permutation tests using practical privacy mechanisms such as Laplace, discrete Laplace, and Google's RAPPOR. We then extend our multinomial approach to continuous data via binning and study its uniform separation rates under LDP over Hölder and Besov smoothness classes. The proposed tests for both discrete and continuous cases rigorously control the type I error for any finite sample size, strictly adhere to LDP constraints, and achieve minimax separation rates under LDP. The attained minimax rates reveal inherent privacy-utility trade-offs that are unavoidable in private testing. To address scenarios with unknown smoothness parameters in density testing, we propose an adaptive test based on a Bonferroni-type approach that ensures robust performance without prior knowledge of the smoothness parameters. We validate our theoretical findings with extensive numerical experiments and demonstrate the practical relevance and effectiveness of our proposed methods.
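As a rough illustration of the multinomial case described above, the following Python sketch privatizes each observation with the Laplace mechanism and then runs a permutation test on the privatized data. It is a minimal sketch under stated assumptions, not the paper's procedure: the function names (`privatize_laplace`, `mean_diff_stat`, `private_permutation_test`) are hypothetical, the noise scale 2/epsilon is the textbook calibration for one-hot vectors (L1 sensitivity 2), and the squared distance between privatized sample means is a simple stand-in for whatever statistic the paper actually uses. Because the privatized observations are exchangeable under the null hypothesis, the permutation p-value remains valid at any finite sample size.

```python
import numpy as np


def privatize_laplace(labels, k, epsilon, rng):
    """Release each category label as a noisy one-hot vector (epsilon-LDP).

    Assumption: two one-hot vectors differ by at most 2 in L1 norm, so
    Laplace noise with scale 2/epsilon per coordinate suffices.
    """
    one_hot = np.eye(k)[labels]
    noise = rng.laplace(loc=0.0, scale=2.0 / epsilon, size=one_hot.shape)
    return one_hot + noise


def mean_diff_stat(pooled, n1):
    """Squared Euclidean distance between the two privatized sample means.

    Illustrative statistic only; the paper's statistic may differ.
    """
    diff = pooled[:n1].mean(axis=0) - pooled[n1:].mean(axis=0)
    return float(diff @ diff)


def private_permutation_test(y, z, n_perm=999, rng=None):
    """Permutation p-value computed from privatized samples y and z."""
    rng = np.random.default_rng() if rng is None else rng
    pooled = np.vstack([y, z])
    n1 = len(y)
    observed = mean_diff_stat(pooled, n1)
    exceed = 0
    for _ in range(n_perm):
        perm = rng.permutation(len(pooled))
        if mean_diff_stat(pooled[perm], n1) >= observed:
            exceed += 1
    # The +1 terms count the observed statistic among the permutations,
    # which is what gives finite-sample type I error control.
    return (1 + exceed) / (1 + n_perm)


# Usage: two multinomial samples over k = 4 categories (synthetic data).
rng = np.random.default_rng(0)
k, n, epsilon = 4, 200, 2.0
sample_1 = rng.choice(k, size=n, p=[0.4, 0.3, 0.2, 0.1])
sample_2 = rng.choice(k, size=n, p=[0.1, 0.2, 0.3, 0.4])
y = privatize_laplace(sample_1, k, epsilon, rng)
z = privatize_laplace(sample_2, k, epsilon, rng)
p_value = private_permutation_test(y, z, n_perm=199, rng=rng)
print(f"permutation p-value: {p_value:.3f}")
```

Only the privatized vectors `y` and `z` enter the test, so the analyst never sees the raw labels. Smaller values of `epsilon` inflate the noise and reduce power, which is the privacy-utility trade-off the abstract refers to.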