In this work, we investigate public data-assisted non-interactive learning under local differential privacy (LDP), with a focus on non-parametric classification. Under the posterior drift assumption, we derive, for the first time, the minimax optimal convergence rate under the LDP constraint. We then present a novel approach, the locally private classification tree, which attains this minimax optimal rate. Furthermore, we design a data-driven pruning procedure that avoids parameter tuning and produces a fast-converging estimator. Comprehensive experiments on synthetic and real datasets show the superior performance of our proposed method. Both our theoretical and experimental findings demonstrate the effectiveness of public data relative to private data, which yields practical suggestions for prioritizing non-private data collection.