We articulate fundamental mismatches between technical methods for machine unlearning in Generative AI, and documented aspirations for broader impact that these methods could have for law and policy. These aspirations are both numerous and varied, motivated by issues that pertain to privacy, copyright, safety, and more. For example, unlearning is often invoked as a solution for removing the effects of targeted information from a generative-AI model's parameters, e.g., a particular individual's personal data or in-copyright expression of Spiderman that was included in the model's training data. Unlearning is also proposed as a way to prevent a model from generating targeted types of information in its outputs, e.g., generations that closely resemble a particular individual's data or reflect the concept of "Spiderman." Both of these goals--the targeted removal of information from a model and the targeted suppression of information from a model's outputs--present various technical and substantive challenges. We provide a framework for thinking rigorously about these challenges, which enables us to be clear about why unlearning is not a general-purpose solution for circumscribing generative-AI model behavior in service of broader positive impact. We aim for conceptual clarity and to encourage more thoughtful communication among machine learning (ML), law, and policy experts who seek to develop and apply technical methods for compliance with policy objectives.