We study indiscriminate poisoning attacks against linear learners, in which an adversary injects a small number of crafted examples into the training data to force the induced model to incur higher test error. Motivated by the observation that linear learners on some datasets resist the best known attacks even without any defenses, we investigate whether datasets can be inherently robust to indiscriminate poisoning attacks for linear learners. For theoretical Gaussian distributions, we rigorously characterize the behavior of an optimal poisoning attack, defined as the poisoning strategy that maximizes the risk of the induced model at a given poisoning budget. Our results prove that linear learners can indeed be robust to indiscriminate poisoning if the class-wise data distributions are well-separated with low variance and the constraint set containing all permissible poisoning points is small. These findings largely explain the drastic variation in the empirical performance of state-of-the-art poisoning attacks on linear learners across benchmark datasets, and they are an important initial step towards understanding why some learning tasks are vulnerable to data poisoning attacks.
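The qualitative claim above can be illustrated with a small simulation. The sketch below is a toy under stated assumptions, not the setting analyzed in the paper. It assumes a balanced one-dimensional Gaussian mixture with class means at +mu and -mu, a least-squares linear classifier as a stand-in learner, a constraint set of the form [-u, u], and a simple heuristic attack that places every poisoning point at x = u with label -1. That heuristic is not the optimal attack characterized in the paper, and all function names, parameter values, and sample sizes are hypothetical.

```python
import numpy as np


def sample(rng, n, mu, sigma):
    """Balanced 1-D mixture: N(+mu, sigma^2) labeled +1, N(-mu, sigma^2) labeled -1."""
    y = rng.choice([-1.0, 1.0], size=n)
    x = y * mu + sigma * rng.standard_normal(n)
    return x, y


def fit_least_squares(x, y):
    """Fit w, b minimizing sum((w * x + b - y)^2). Assumed stand-in for a linear learner."""
    design = np.column_stack([x, np.ones_like(x)])
    (w, b), *_ = np.linalg.lstsq(design, y, rcond=None)
    return w, b


def error(w, b, x, y):
    """Fraction of points whose predicted sign differs from the label."""
    return float(np.mean(np.sign(w * x + b) != y))


def clean_and_poisoned_error(mu, sigma, u, eps=0.05, n=20000, seed=0):
    """Return (clean test error, poisoned test error) for one toy configuration."""
    rng = np.random.default_rng(seed)
    x_tr, y_tr = sample(rng, n, mu, sigma)
    x_te, y_te = sample(rng, n, mu, sigma)

    clean = error(*fit_least_squares(x_tr, y_tr), x_te, y_te)

    # Heuristic attack (an assumption, not the optimal attack): put all
    # eps * n poisoning points at the edge of the constraint set [-u, u],
    # at x = u, and give them the label -1.
    k = int(eps * n)
    x_p = np.concatenate([x_tr, np.full(k, u)])
    y_p = np.concatenate([y_tr, np.full(k, -1.0)])
    poisoned = error(*fit_least_squares(x_p, y_p), x_te, y_te)
    return clean, poisoned


# Well-separated, low-variance classes with a small constraint set.
separated = clean_and_poisoned_error(mu=3.0, sigma=0.5, u=4.0)
# Overlapping classes with a large constraint set.
overlapping = clean_and_poisoned_error(mu=0.5, sigma=1.0, u=20.0)

print("separated   clean=%.3f poisoned=%.3f" % separated)
print("overlapping clean=%.3f poisoned=%.3f" % overlapping)
```

Under these assumptions, the same poisoning budget should barely change the test error in the first configuration and raise it substantially in the second, in line with the dependence on class separation, variance, and constraint-set size described above.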