Verifying the robustness of machine learning models against evasion attacks at test time is an important research problem. Unfortunately, prior work established that this problem is NP-hard for decision tree ensembles, and is hence bound to be intractable for specific inputs. In this paper, we identify a restricted class of decision tree ensembles, called large-spread ensembles, which admit a security verification algorithm running in polynomial time. We then propose a new approach called verifiable learning, which advocates the training of restricted model classes that are amenable to efficient verification. We show the benefits of this idea by designing a new training algorithm that automatically learns a large-spread decision tree ensemble from labelled data, thus enabling its security verification in polynomial time. Experimental results on publicly available datasets confirm that large-spread ensembles trained using our algorithm can be verified in a matter of seconds, using standard commercial hardware. Moreover, large-spread ensembles are more robust than traditional ensembles against evasion attacks, while incurring only a relatively small loss of accuracy in the non-adversarial setting.