Models trained by minimizing the average loss often fail to be accurate on small or hard-to-learn groups of the data. Various methods address this issue by optimizing a weighted objective that focuses on the worst-performing groups. However, this approach becomes problematic when learning with differential privacy, as unequal data weighting can result in inhomogeneous privacy guarantees, in particular weaker privacy for minority groups. In this work, we introduce a new algorithm for differentially private worst-case group optimization called ASC (Adaptively Sampled and Clipped Worst-case Group Optimization). It adaptively controls both the sampling rate and the clipping threshold of each group. This allows harder-to-learn groups to be sampled more often while ensuring consistent privacy guarantees across all groups. Compared to prior work, ASC yields lower-variance gradients, tighter privacy guarantees, and substantially higher worst-case group accuracy without sacrificing overall average accuracy.
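To make the core mechanism concrete, the following is a minimal sketch of one DP-SGD-style update with group-specific sampling rates and clipping thresholds. It is an illustration of the general idea only, not the paper's exact ASC update rule; the function name, the dictionary-based per-group parameters, and the noise calibration to the largest clipping threshold are all assumptions for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def dp_group_step(per_example_grads, group_ids, sample_rate, clip_thresh, sigma):
    """One DP-SGD-style step with per-group Poisson sampling rates and
    clipping thresholds (illustrative only, not the paper's exact ASC rule).

    per_example_grads: (n, d) array of per-example gradients
    group_ids:         length-n list of group labels
    sample_rate:       dict group -> Poisson sampling probability
    clip_thresh:       dict group -> L2 clipping threshold
    sigma:             noise multiplier
    """
    d = per_example_grads.shape[1]
    total = np.zeros(d)
    for g, grad in zip(group_ids, per_example_grads):
        # Poisson-subsample each example at its group's rate, so harder
        # groups can be sampled more often.
        if rng.random() >= sample_rate[g]:
            continue
        # Clip to the group's own threshold to bound per-example sensitivity.
        norm = np.linalg.norm(grad)
        total += grad * min(1.0, clip_thresh[g] / max(norm, 1e-12))
    # Calibrate Gaussian noise to the largest per-group threshold
    # (a conservative worst-case sensitivity bound for this sketch).
    noise_scale = sigma * max(clip_thresh.values())
    return total + rng.normal(0.0, noise_scale, size=d)
```

In this sketch, raising a group's sampling rate while lowering its clipping threshold trades off how often the group contributes against how large each contribution can be, which is one plausible lever for keeping per-group privacy cost comparable.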