Certified Robustness to Data Poisoning in Gradient-Based Training

Modern machine learning pipelines leverage large amounts of public data, making it infeasible to guarantee data quality and leaving models open to poisoning and backdoor attacks. Provably bounding model behavior under such attacks remains an open problem. In this work, we address this challenge by developing the first framework providing provable guarantees on the behavior of models trained with potentially manipulated data without modifying the model or learning algorithm. In particular, our framework certifies robustness against untargeted and targeted poisoning, as well as backdoor attacks, for bounded and unbounded manipulations of the training inputs and labels. Our method leverages convex relaxations to over-approximate the set of all possible parameter updates for a given poisoning threat model, allowing us to bound the set of all reachable parameters for any gradient-based learning algorithm. Given this set of parameters, we provide bounds on worst-case behavior, including model performance and backdoor success rate. We demonstrate our approach on multiple real-world datasets from applications including energy consumption, medical imaging, and autonomous driving.
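To make the idea of bounding the set of reachable parameters concrete, the following is a minimal sketch that uses plain interval arithmetic on a linear regression model. It is not the paper's algorithm: the interval box stands in for the convex relaxations mentioned above, and the threat model (at most `n_poison` labels shifted by at most `eps`), the function name `certified_sgd_step`, and all parameter names are assumptions chosen for illustration.

```python
import numpy as np

def certified_sgd_step(theta_lo, theta_hi, X, y, lr, n_poison, eps):
    """One full-batch gradient step on squared loss, tracked as a box.

    Assumption (illustrative threat model): an adversary may shift at most
    n_poison labels, each by at most eps. Returns a box [lo, hi] that
    contains every parameter vector reachable after this step.
    """
    b = X.shape[0]
    X_pos, X_neg = np.maximum(X, 0.0), np.minimum(X, 0.0)
    # Interval bounds on the predictions X @ theta for any theta in the box.
    pred_lo = X_pos @ theta_lo + X_neg @ theta_hi
    pred_hi = X_pos @ theta_hi + X_neg @ theta_lo
    # Residual bounds with respect to the clean labels.
    r_lo, r_hi = pred_lo - y, pred_hi - y
    # Per-sample gradient bounds for g_ij = x_ij * r_i.
    g_lo = np.minimum(X * r_lo[:, None], X * r_hi[:, None])
    g_hi = np.maximum(X * r_lo[:, None], X * r_hi[:, None])
    # Shifting label i by at most eps widens g_ij by at most |x_ij| * eps.
    # The worst case per coordinate poisons the n_poison largest |x_ij|.
    k = min(n_poison, b)
    slack = eps * np.sort(np.abs(X), axis=0)[b - k:, :].sum(axis=0)
    grad_lo = (g_lo.sum(axis=0) - slack) / b
    grad_hi = (g_hi.sum(axis=0) + slack) / b
    # theta' = theta - lr * grad, evaluated with interval arithmetic (lr > 0).
    return theta_lo - lr * grad_hi, theta_hi - lr * grad_lo

# Usage: propagate the box through a few training steps on synthetic data.
rng = np.random.default_rng(0)
X = rng.normal(size=(32, 3))
y = X @ np.array([1.0, -2.0, 0.5])
lo, hi = np.zeros(3), np.zeros(3)
for _ in range(10):
    lo, hi = certified_sgd_step(lo, hi, X, y, lr=0.1, n_poison=4, eps=0.1)
print("certified parameter box width:", hi - lo)
```

Any model trained on a dataset within the assumed threat model ends up inside the final box, so worst-case behavior can then be bounded by optimizing over that box. Naive interval arithmetic of this kind grows loose quickly over many steps, which is one reason tighter relaxations matter in practice.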

