Learning from demonstrations is a popular approach to training AI models; however, the vulnerability of such models to adversarial attacks remains underexplored. We present the first systematic study of adversarial attacks across a range of classic and recently proposed imitation learning algorithms, including Vanilla Behavior Cloning (Vanilla BC), LSTM-GMM, Implicit Behavior Cloning (IBC), Diffusion Policy (DP), and Vector-Quantized Behavior Transformer (VQ-BET). We study the vulnerability of these methods to white-box, grey-box, and black-box adversarial perturbations. Our experiments reveal that most existing methods are highly vulnerable to these attacks, including black-box attacks that transfer across algorithms. To the best of our knowledge, this is the first work to study and compare the vulnerabilities of different popular imitation learning algorithms to both white-box and black-box attacks. Our findings highlight the vulnerabilities of modern imitation learning algorithms and pave the way for future work on addressing them. Videos and code are available at https://sites.google.com/view/uap-attacks-on-bc.