Machine learning systems increasingly face requirements to remove entire domains of information, such as toxic language or biases, rather than individual user data. This task presents a dilemma: fully removing the unwanted domain's data is computationally expensive, while random partial removal is statistically inefficient. We find that a domain's statistical influence is often concentrated in a small subset of its samples, suggesting a middle path between ineffective partial removal and unnecessary complete removal. We formalize this as distributional unlearning: a framework for selecting a small subset that balances forgetting an unwanted distribution against preserving a desired one. Using Kullback-Leibler divergence constraints, we derive the exact removal-preservation Pareto frontier for Gaussian distributions and prove that models trained on the edited data achieve the corresponding log-loss bounds. We propose a distance-based selection algorithm and show that it is quadratically more sample-efficient than random removal in the challenging low-divergence regime. Experiments on synthetic, text, and image datasets (Jigsaw, CIFAR-10, SMS spam) show that our method requires 15-82% less deletion than full removal to achieve strong unlearning effects, e.g., halving the initial forget-set accuracy. Ultimately, by showing that a small forget set often suffices, our framework lays the foundation for more scalable and rigorous subpopulation unlearning.
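
For reference (a standard fact, not stated in the abstract itself), the Kullback-Leibler divergence between two d-dimensional Gaussians N(mu_0, Sigma_0) and N(mu_1, Sigma_1), which underlies the removal-preservation constraints in the Gaussian case, has the closed form

\[
D_{\mathrm{KL}}\!\left(\mathcal{N}(\mu_0, \Sigma_0) \,\|\, \mathcal{N}(\mu_1, \Sigma_1)\right)
= \tfrac{1}{2}\left[\operatorname{tr}\!\left(\Sigma_1^{-1}\Sigma_0\right)
+ (\mu_1 - \mu_0)^{\top}\Sigma_1^{-1}(\mu_1 - \mu_0)
- d + \ln\frac{\det \Sigma_1}{\det \Sigma_0}\right].
\]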
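To make the distance-based selection concrete, the following is a minimal sketch, not the paper's stated algorithm: the function name, the Mahalanobis-distance criterion against the retained distribution, and the choice to delete the farthest forget-domain samples are all our assumptions.

import numpy as np

def select_forget_subset(forget_data, retain_data, k):
    """Hypothetical distance-based selector: return indices of the k
    forget-domain samples to delete, ranked by squared Mahalanobis
    distance from the retained distribution's empirical Gaussian fit."""
    mu = retain_data.mean(axis=0)
    cov = np.cov(retain_data, rowvar=False)
    cov_inv = np.linalg.pinv(cov)  # pseudo-inverse for near-singular covariances
    diffs = forget_data - mu  # shape (n, d)
    # Squared Mahalanobis distance of each forget sample to the retain mean
    d2 = np.einsum('nd,de,ne->n', diffs, cov_inv, diffs)
    # Assumption: deleting the farthest samples removes the points that pull
    # the pooled empirical distribution hardest away from the retained one.
    return np.argsort(d2)[-k:]

# Toy usage: forget domain shifted away from the retained one
rng = np.random.default_rng(0)
retain = rng.normal(0.0, 1.0, size=(500, 2))
forget = rng.normal(3.0, 1.0, size=(200, 2))
idx = select_forget_subset(forget, retain, k=40)

Under these assumptions, deleting only the k most distant forget-domain points targets the small subset in which the domain's statistical influence is concentrated, consistent with the abstract's claim that a small forget set often suffices.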