Verifying the robustness of machine learning models against evasion attacks at test time is an important research problem. Unfortunately, prior work established that this problem is NP-hard for decision tree ensembles, and hence bound to be intractable for some inputs. In this paper, we identify a restricted class of decision tree ensembles, called large-spread ensembles, which admit a security verification algorithm running in polynomial time. We then propose a new approach called verifiable learning, which advocates training restricted model classes of this kind that are amenable to efficient verification. We show the benefits of this idea by designing a new training algorithm that automatically learns a large-spread decision tree ensemble from labelled data, thus enabling its security verification in polynomial time. Experimental results on public datasets confirm that large-spread ensembles trained using our algorithm can be verified in a matter of seconds on standard commercial hardware. Moreover, large-spread ensembles are more robust than traditional ensembles against evasion attacks, at the cost of an acceptable loss of accuracy in the non-adversarial setting.