Releasing data once and for all under noninteractive Local Differential Privacy (LDP) enables complete data reusability, but the resulting noise may bias subsequent analyses. In this work, we leverage the Weierstrass transform to characterize this bias in binary classification. We prove that inverting this transform yields a bias-correction method that computes unbiased estimates of nonlinear functions of examples released under LDP. Building on this result, we design a novel stochastic gradient descent algorithm called Inverse Weierstrass Private SGD (IWP-SGD). It converges to the true population risk minimizer at a rate of $\mathcal{O}(1/n)$, where $n$ is the number of examples. We empirically validate IWP-SGD on binary classification tasks using synthetic and real-world datasets.
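The core idea can be illustrated with a minimal sketch. This is not the paper's IWP-SGD algorithm, only an assumed toy setting: examples are released once with additive Gaussian noise (a common LDP-style mechanism), so the expected value of a nonlinear function $f$ on a noisy release equals the Weierstrass transform of $f$ at the true point. For $f(x) = x^2$, the transform gives $\mathbb{E}[f(x + \mathcal{N}(0, \sigma^2))] = x^2 + \sigma^2$, so the inverse-transform correction is simply to subtract $\sigma^2$:

```python
import numpy as np

# Toy illustration (not the paper's algorithm): Gaussian noise added for
# privacy biases the nonlinear statistic f(x) = x^2, and an
# inverse-Weierstrass-style correction removes that bias exactly.
rng = np.random.default_rng(0)
sigma = 2.0                               # assumed noise scale of the mechanism
x = rng.uniform(-1.0, 1.0, 100_000)       # private examples (never released)
y = x + rng.normal(0.0, sigma, x.size)    # one-shot noninteractive release

true_val = np.mean(x ** 2)                # target quantity on the private data
naive = np.mean(y ** 2)                   # biased: inflated by about sigma^2
corrected = naive - sigma ** 2            # inverse Weierstrass transform of x^2

print(f"true={true_val:.3f}  naive={naive:.3f}  corrected={corrected:.3f}")
```

The naive estimate overshoots by roughly $\sigma^2 = 4$, while the corrected estimate matches the private-data value up to Monte Carlo error; the paper's contribution is extending this kind of debiasing to the gradients used inside SGD.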