Sponge examples are test-time inputs carefully optimized to increase the energy consumption and latency of neural networks deployed on hardware accelerators. In this work, we are the first to demonstrate that sponge examples can also be injected at training time, via an attack that we call sponge poisoning. This attack increases the energy consumption and latency of machine-learning models indiscriminately, on every test-time input. We present a novel formalization of sponge poisoning that overcomes the limitations of optimizing test-time sponge examples, and show that the attack is possible even if the attacker controls only a few model updates; for instance, if model training is outsourced to an untrusted third party or distributed via federated learning. Our extensive experimental analysis shows that sponge poisoning can almost completely nullify the effect of hardware accelerators. We also analyze the activations of poisoned models, identifying which components are more vulnerable to this attack. Finally, we examine the feasibility of countermeasures that reduce the energy consumption of sponge-poisoned models, showing that sanitization methods may be overly expensive for most users.
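The abstract does not state the training objective, so the following is only a minimal sketch of how such an attack could be set up. It assumes that the accelerator saves energy by skipping zero-valued activations, and that the attacker therefore rewards dense (non-zero) activations through a smooth surrogate of the L0 count. The function names, the surrogate a^2 / (a^2 + sigma), and the parameters `lam` and `sigma` are illustrative assumptions, not taken from the text above.

```python
from typing import Sequence


def relu(values: Sequence[float]) -> list[float]:
    """Element-wise ReLU; its zero outputs are what a sparsity-aware accelerator can skip."""
    return [max(0.0, v) for v in values]


def activation_density(activations: Sequence[float]) -> float:
    """Fraction of non-zero activations (exact L0 count, not differentiable)."""
    if not activations:
        return 0.0
    return sum(1 for a in activations if a != 0.0) / len(activations)


def smooth_l0(activations: Sequence[float], sigma: float = 1e-4) -> float:
    """Assumed differentiable surrogate of the L0 count: sum of a^2 / (a^2 + sigma)."""
    return sum(a * a / (a * a + sigma) for a in activations)


def sponge_objective(
    task_loss: float,
    layer_activations: Sequence[Sequence[float]],
    lam: float = 1.0,
    sigma: float = 1e-4,
) -> float:
    """Hypothetical poisoned objective to minimize: task loss minus a weighted energy proxy.

    Minimizing it keeps the task loss low while pushing activations away from zero.
    """
    energy_proxy = sum(smooth_l0(layer, sigma) for layer in layer_activations)
    return task_loss - lam * energy_proxy


if __name__ == "__main__":
    sparse_layer = relu([-1.0, -0.5, 2.0, 3.0])  # half of the units are silenced
    dense_layer = relu([0.5, 1.0, 2.0, 3.0])     # every unit fires
    print(activation_density(sparse_layer), activation_density(dense_layer))
    print(sponge_objective(1.0, [sparse_layer]), sponge_objective(1.0, [dense_layer]))
```

Under these assumptions, the dense layer yields the lower objective value for the same task loss, which is the direction a poisoned update would push the model. An attacker who controls only some updates would apply such an objective on those updates alone.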