Intensive algorithmic efforts have recently been made to rapidly improve the certified robustness of complex ML models. However, current robustness certification methods can only certify robustness within a limited perturbation radius. Given that existing purely data-driven statistical approaches have reached a bottleneck, in this paper we propose to integrate statistical ML models with knowledge (expressed as logical rules) as a reasoning component using Markov logic networks (MLN), so as to further improve the overall certified robustness. This opens new research questions about certifying the robustness of such a paradigm, especially the reasoning component (e.g., MLN). As a first step towards understanding these questions, we prove that certifying the robustness of MLN is #P-hard. Guided by this hardness result, we then derive the first certified robustness bound for MLN by carefully analyzing different model regimes. Finally, we conduct extensive experiments on five datasets including both high-dimensional images and natural language texts, and we show that the certified robustness with knowledge-based logical reasoning indeed significantly outperforms that of the state of the art.