Understanding Variation in Subpopulation Susceptibility to Poisoning Attacks

Machine learning is susceptible to poisoning attacks, in which an attacker controls a small fraction of the training data and chooses that data with the goal of inducing, in the trained model, some behavior unintended by the model developer. We consider a realistic setting in which an adversary able to insert a limited number of data points attempts to control the model's behavior on a specific subpopulation. Inspired by previous observations that random label-flipping attacks are disparately effective across subpopulations, we investigate the properties that can impact the effectiveness of state-of-the-art poisoning attacks against different subpopulations. For a family of 2-dimensional synthetic datasets, we empirically find that dataset separability plays a dominant role in subpopulation vulnerability for less separable datasets. However, well-separated datasets exhibit more dependence on individual subpopulation properties. We further discover that a crucial subpopulation property is captured by the difference in loss on the clean dataset between the clean model and a target model that misclassifies the subpopulation: a subpopulation is much easier to attack if this loss difference is small. This property also generalizes to high-dimensional benchmark datasets. For the Adult benchmark dataset, we show that we can find semantically meaningful subpopulation properties that are related to the susceptibilities of a selected group of subpopulations. The results in this paper are accompanied by a fully interactive web-based visualization of subpopulation poisoning attacks, found at https://uvasrg.github.io/visualizing-poisoning
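
To make the loss-difference property concrete, the following is a minimal sketch and not the paper's implementation. Everything in it is an assumption chosen for illustration: the two-Gaussian dataset, the subpopulation rule, L2-regularized logistic regression as the model and loss, and the way the target model is obtained (retraining on the clean data plus `k` label-flipped copies of the subpopulation). The abstract specifies none of these.

```python
import numpy as np

L2 = 1e-2  # regularization strength (assumed value)

def add_bias(X):
    """Append a constant column so the linear model has an intercept."""
    return np.hstack([X, np.ones((len(X), 1))])

def objective(w, X, y):
    """Mean logistic loss plus L2 penalty; this is the 'loss' used below."""
    z = add_bias(X) @ w
    return float(np.mean(np.logaddexp(0.0, z) - y * z) + 0.5 * L2 * (w @ w))

def fit(X, y, lr=0.1, steps=5000):
    """Minimize the objective above with plain gradient descent."""
    Xb = add_bias(X)
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(Xb @ w)))
        w -= lr * (Xb.T @ (p - y) / len(y) + L2 * w)
    return w

# Synthetic 2-D dataset: one Gaussian cluster per class (assumed setup).
rng = np.random.default_rng(0)
n = 200
X = np.vstack([rng.normal([-2.0, 0.0], 1.0, (n, 2)),
               rng.normal([2.0, 0.0], 1.0, (n, 2))])
y = np.concatenate([np.zeros(n), np.ones(n)])

# Hypothetical subpopulation: class-0 points in the upper part of the plane.
subpop = (y == 0) & (X[:, 1] > 1.0)

# Clean model: trained on the clean data only.
w_clean = fit(X, y)

# Target model (one hypothetical construction): flip the subpopulation's
# labels and add k extra flipped copies so the model is pushed to
# misclassify that subpopulation.
k = 5
X_t = np.vstack([X, np.repeat(X[subpop], k, axis=0)])
y_t = np.concatenate([np.where(subpop, 1.0, y), np.ones(k * subpop.sum())])
w_target = fit(X_t, y_t)

# Loss difference on the CLEAN data between target and clean model.
loss_diff = objective(w_target, X, y) - objective(w_clean, X, y)

# Diagnostics: clean accuracy, and how much of the subpopulation the
# target model actually assigns to the wrong class (label 1).
clean_acc = float(np.mean((add_bias(X) @ w_clean > 0).astype(float) == y))
target_subpop_error = float(np.mean(add_bias(X[subpop]) @ w_target > 0))

print(f"loss difference: {loss_diff:.4f}")
print(f"clean accuracy: {clean_acc:.3f}")
print(f"target error on subpopulation: {target_subpop_error:.3f}")
```

Under the abstract's claim, repeating this computation for several candidate subpopulations and comparing the resulting `loss_diff` values would rank them by expected susceptibility, with smaller values indicating easier targets. Whether the target model built here fully misclassifies the subpopulation depends on `k` and on the data, so `target_subpop_error` should be checked before reading anything into `loss_diff`.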
