Gradient attacks and data poisoning both tamper with the training of machine learning algorithms to maliciously alter them, and the two have been proven equivalent in convex settings. The extent of the harm these attacks can cause in non-convex settings remains to be determined. Gradient attacks can affect far fewer systems than data poisoning, but they have been argued to be more harmful because the injected gradients can be arbitrary, whereas data poisoning restricts the attacker to injecting data points into training sets, e.g. through legitimate participation in a collaborative dataset. This raises the question of whether the harm caused by gradient attacks can be matched by data poisoning in non-convex settings. In this work, we provide a positive answer in a worst-case scenario and show how data poisoning can mimic a gradient attack to perform an availability attack on (non-convex) neural networks. Through gradient inversion, commonly used to reconstruct data points from actual gradients, we show that reconstructing data points out of malicious gradients can be sufficient to perform a range of attacks. This allows us to demonstrate, for the first time, an availability attack on neural networks through data poisoning that degrades the model's performance to random-guessing levels with only a minority (as low as 1%) of poisoned points.
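The gradient-inversion step mentioned above can be sketched as follows. This is a minimal, illustrative PyTorch example (not the paper's implementation): given a gradient observed on a model, the attacker optimizes a dummy input until its gradient matches the observed one, thereby reconstructing a data point that reproduces that gradient when trained on. The toy linear model, optimizer settings, and the assumption that the label is known are all illustrative choices.

```python
import torch

torch.manual_seed(0)

model = torch.nn.Linear(4, 2)          # toy model; the same recipe applies to neural networks
loss_fn = torch.nn.CrossEntropyLoss()

# Private point whose gradient the attacker observes (e.g. in federated learning).
x_true = torch.randn(1, 4)
y_true = torch.tensor([1])             # assumed known to the attacker for simplicity
target_grad = torch.autograd.grad(loss_fn(model(x_true), y_true),
                                  model.parameters())

# Attacker side: start from noise and minimize the gradient-matching distance.
x_dummy = torch.randn(1, 4, requires_grad=True)
x0 = x_dummy.detach().clone()          # kept only to measure improvement
opt = torch.optim.Adam([x_dummy], lr=0.1)
for _ in range(300):
    opt.zero_grad()
    g = torch.autograd.grad(loss_fn(model(x_dummy), y_true),
                            model.parameters(), create_graph=True)
    grad_diff = sum(((a - b) ** 2).sum() for a, b in zip(g, target_grad))
    grad_diff.backward()
    opt.step()

# x_dummy now induces (approximately) the target gradient; the same procedure,
# run against a *malicious* target gradient, yields poisoned data points.
print(float((x_dummy - x_true).norm()))
```

Run against a malicious rather than an honest gradient, the same matching loss turns a gradient attack into a set of poisoned training points, which is the reduction the abstract describes.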