Responsible use of data is an indispensable part of any machine learning (ML) implementation. ML developers must carefully collect and curate their datasets, and document their provenance. They must also make sure to respect intellectual property rights, preserve individual privacy, and use data in an ethical way. Over the past few years, ML models have significantly increased in size and complexity. These models require a very large amount of data and compute capacity to train, to the extent that any defects in the training corpus cannot be trivially remedied by retraining the model from scratch. Despite sophisticated controls on training data and a significant amount of effort dedicated to ensuring that training corpora are properly composed, the sheer volume of data required for the models makes it challenging to manually inspect each datum comprising a training corpus. One potential fix for training corpus data defects is model disgorgement -- the elimination of not just the improperly used data, but also the effects of improperly used data on any component of an ML model. Model disgorgement techniques can be used to address a wide range of issues, such as reducing bias or toxicity, increasing fidelity, and ensuring responsible usage of intellectual property. In this paper, we introduce a taxonomy of possible disgorgement methods that are applicable to modern ML systems. In particular, we investigate the meaning of "removing the effects" of data in the trained model in a way that does not require retraining from scratch.