Differential privacy changes the effective sample size governing CVaR learning. For tail mass $τ$, the privacy-relevant sample size is not $n$ but $nτ$; with privacy parameter $ε$, the effective private tail sample size is $εnτ$. Private CVaR excess risk decomposes into ordinary tail-risk statistical error and a privacy price. This decomposition is complete for scalar estimation and finite classes: scalar estimation has rate $Θ(B \min\{1,(nτ)^{-1/2}+(εnτ)^{-1}\})$, and finite classes of size $M$ have rate $Θ(B \min\{1,\sqrt{\log(2M)/(nτ)}+\log(2M)/(εnτ)\})$. These complete rates hold under pure DP, and their lower bounds extend to approximate DP in the stated small-$δ$ regimes. For convex Lipschitz learning, modular upper and lower reductions show that the CVaR-specific privacy term necessarily scales as $1/(εnτ)$, with dimension dependence inherited from private stochastic convex optimization. Together, these results identify ordinary private learning on $Θ(nτ)$ informative tail records as the canonical hard subproblem inside private CVaR learning.
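To make the two rate expressions concrete, the following minimal sketch evaluates them numerically. It assumes all constants hidden by $Θ(\cdot)$ equal 1, so the outputs indicate scaling only, not actual excess-risk values; the function names `scalar_rate` and `finite_class_rate` are hypothetical and introduced purely for illustration.

```python
import math


def scalar_rate(n, tau, eps, B=1.0):
    """Scalar CVaR rate B * min{1, (n*tau)^(-1/2) + (eps*n*tau)^(-1)}.

    Constants hidden by Theta(.) are set to 1 (an assumption).
    """
    m = n * tau  # effective tail sample size
    return B * min(1.0, m ** -0.5 + 1.0 / (eps * m))


def finite_class_rate(n, tau, eps, M, B=1.0):
    """Finite-class rate B * min{1, sqrt(log(2M)/(n*tau)) + log(2M)/(eps*n*tau)}.

    Constants hidden by Theta(.) are set to 1 (an assumption).
    """
    m = n * tau  # effective tail sample size
    log_term = math.log(2 * M)
    return B * min(1.0, math.sqrt(log_term / m) + log_term / (eps * m))


if __name__ == "__main__":
    # Illustrative parameter values only; they are not taken from the paper.
    n, tau = 10_000, 0.01
    for eps in (0.1, 1.0, 10.0):
        print(eps, scalar_rate(n, tau, eps), finite_class_rate(n, tau, eps, M=100))
```

With $n$ and $τ$ fixed, shrinking $ε$ inflates only the second summand, which illustrates the sense in which the privacy price depends on $εnτ$ rather than on $εn$.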