In this work, we investigate the problem of public-data-assisted non-interactive locally differentially private (LDP) learning, with a focus on non-parametric classification. Under the posterior drift assumption, we derive, for the first time, the minimax optimal convergence rate under the LDP constraint. We then present a novel approach, the locally differentially private classification tree, which attains this minimax optimal rate. Furthermore, we design a data-driven pruning procedure that avoids parameter tuning and yields a fast-converging estimator. Comprehensive experiments on synthetic and real-world datasets demonstrate the superior performance of our proposed methods. Both our theoretical and experimental findings confirm the effectiveness of public data relative to private data, leading to practical suggestions for prioritizing non-private data collection.