Verifying the robustness of machine learning models against evasion attacks at test time is an important research problem. Unfortunately, prior work established that this problem is NP-hard for decision tree ensembles, and hence bound to be intractable for specific inputs. In this paper, we identify a restricted class of decision tree ensembles, called large-spread ensembles, which admit a security verification algorithm running in polynomial time. We then propose a new approach called verifiable learning, which advocates training such restricted model classes that are amenable to efficient verification. We show the benefits of this idea by designing a new training algorithm that automatically learns a large-spread decision tree ensemble from labelled data, thus enabling its security verification in polynomial time. Experimental results on public datasets confirm that large-spread ensembles trained using our algorithm can be verified in a matter of seconds on standard commercial hardware. Moreover, large-spread ensembles are more robust than traditional ensembles against evasion attacks, at the cost of an acceptable loss of accuracy in the non-adversarial setting.