We study Stochastic Convex Optimization (SCO) under local Label Differential Privacy (L-LDP). In this setting, the features are considered public, but the corresponding labels are sensitive and must be randomized locally by each user before being sent to an untrusted analyzer. Prior work on SCO under L-LDP (Ghazi et al., 2021) established an excess population risk bound with a \emph{linear} dependence on the size of the label space, $K$: $O\left(\frac{K}{\varepsilon\sqrt{n}}\right)$ in the high-privacy regime ($\varepsilon \leq 1$) and $O\left(\frac{K}{e^{\varepsilon}\sqrt{n}}\right)$ in the medium-privacy regime ($1 \leq \varepsilon \leq \ln K$). This left open whether the linear cost is fundamental to the L-LDP model. In this note, we resolve this question. First, we present a novel and efficient non-interactive L-LDP algorithm that achieves an excess risk of $O\left(\sqrt{\frac{K}{\varepsilon n}}\right)$ in the high-privacy regime ($\varepsilon \leq 1$) and $O\left(\sqrt{\frac{K}{e^{\varepsilon} n}}\right)$ in the medium-privacy regime ($1 \leq \varepsilon \leq \ln K$). This quadratically improves the dependence on the label space size, from $O(K)$ to $O(\sqrt{K})$. Second, we prove a matching information-theoretic lower bound across all privacy regimes, valid for all sufficiently large $n$.
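To gauge the size of the gain, one can divide the earlier bound by the new one in each regime, treating the hidden constants as comparable (an assumption made only for this illustration):
\[
\frac{K/(\varepsilon\sqrt{n})}{\sqrt{K/(\varepsilon n)}} = \sqrt{\frac{K}{\varepsilon}} \quad (\varepsilon \leq 1),
\qquad
\frac{K/(e^{\varepsilon}\sqrt{n})}{\sqrt{K/(e^{\varepsilon} n)}} = \sqrt{\frac{K}{e^{\varepsilon}}} \quad (1 \leq \varepsilon \leq \ln K).
\]
Both ratios are at least $1$ on their stated ranges. The second shrinks to $1$ as $\varepsilon$ approaches $\ln K$, where the two bounds coincide up to constants at order $1/\sqrt{n}$.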