Patch robustness certification is an emerging class of defenses against adversarial patch attacks that offer provable guarantees. There are two research lines: certified recovery and certified detection. Certified recovery aims to label malicious samples correctly with provable guarantees, whereas certified detection aims to issue warnings, with provable guarantees, for malicious samples that are predicted with non-benign labels. However, existing certified detection defenders leave the labels they protect subject to manipulation, and existing certified recovery defenders cannot systematically issue warnings about the labels of samples. A certified defense that simultaneously offers robust labels and systematic warning protection against patch attacks is therefore desirable. This paper proposes a novel certified defense technique called CrossCert. CrossCert cross-checks two certified recovery defenders to provide both unwavering certification and detection certification. Unwavering certification ensures that a certified sample, when subjected to a patch perturbation, is provably always returned with its benign label and without triggering any warning. To our knowledge, CrossCert is the first certified detection technique to offer this guarantee. Our experiments show that CrossCert performs slightly below ViP and comparably to PatchCensor in detection certification, while certifying a significant proportion of samples with the guarantee of unwavering certification.
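To make the cross-checking idea concrete, the following is a minimal sketch rather than the actual CrossCert algorithm. It assumes a hypothetical interface in which each certified recovery defender returns a predicted label together with a flag stating whether recovery is certified for that sample. The names `Defender`, `CrossCheckResult`, and `cross_check`, as well as the specific decision rules, are illustrative assumptions and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical interface (assumption): a certified recovery defender maps an
# input to (predicted_label, recovery_certified). A True flag is assumed to
# mean the defender's label cannot be changed by any patch in the threat model.
Defender = Callable[[object], Tuple[int, bool]]


@dataclass
class CrossCheckResult:
    label: int                  # label returned to the user (from the primary defender)
    warning: bool               # raised when the two defenders disagree
    detection_certified: bool   # any label change would provably raise a warning
    unwavering_certified: bool  # label provably stays the same and no warning is raised


def cross_check(x: object, primary: Defender, auxiliary: Defender) -> CrossCheckResult:
    """Cross-check two certified recovery defenders on one sample (illustrative rules)."""
    label_p, cert_p = primary(x)
    label_a, cert_a = auxiliary(x)
    agree = label_p == label_a

    # Assumed rule: if the defenders agree and at least one label is certified
    # as recoverable, a patch that changes the returned label must create a
    # disagreement, which surfaces as a warning.
    detection = agree and (cert_p or cert_a)

    # Assumed rule: if both labels are certified and equal, neither defender
    # can be flipped, so the label is stable and no warning can be triggered.
    unwavering = agree and cert_p and cert_a

    return CrossCheckResult(label_p, not agree, detection, unwavering)
```

Under these assumed rules, every sample with unwavering certification also has detection certification, which matches the intuition that the former is the stronger guarantee.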