Patch robustness certification guarantees that no patch within a given bound on a sample can manipulate a deep learning model into predicting a different label. However, existing techniques cannot certify samples that fail to meet their strict conditions at the classifier or patch-region level. This paper proposes MajorCert. MajorCert first finds all possible label sets that the same patch region can manipulate on the same sample across the underlying classifiers, then enumerates their combinations element-wise, and finally checks whether the majority invariant holds for all of these combinations in order to certify the sample.
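The three steps described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation. It assumes that, for each patch region, we are already given one label set per underlying classifier (the labels that classifier might output when a patch lies in that region); how those sets are computed is outside the sketch. It also assumes that the "majority invariant" means the most-voted label over the classifiers stays equal to the clean prediction, with ties conservatively treated as failures. The names `plurality`, `certify_sample`, and `label_sets_by_region` are hypothetical.

```python
from collections import Counter
from itertools import product
from typing import Hashable, List, Optional, Sequence, Set

Label = Hashable


def plurality(votes: Sequence[Label]) -> Optional[Label]:
    """Return the most-voted label, or None if there are no votes or a tie.

    Treating a tie as "no winner" is a conservative assumption of this sketch.
    """
    if not votes:
        return None
    top = Counter(votes).most_common(2)
    if len(top) == 2 and top[0][1] == top[1][1]:
        return None
    return top[0][0]


def certify_sample(
    label_sets_by_region: List[List[Set[Label]]],
    clean_label: Label,
) -> bool:
    """Check the (assumed) majority invariant for every patch region.

    label_sets_by_region[r][c] is the set of labels classifier c may be
    manipulated into predicting when the patch lies in region r.
    """
    for label_sets in label_sets_by_region:
        # Element-wise enumeration: pick one label per classifier.
        for combo in product(*label_sets):
            if plurality(combo) != clean_label:
                return False  # some combination flips or ties the vote
    return True


# Hypothetical example: three classifiers, two patch regions.
robust_regions = [
    [{"cat"}, {"cat"}, {"cat", "dog"}],   # region 0: only classifier 2 is manipulable
    [{"cat", "dog"}, {"cat"}, {"cat"}],   # region 1: only classifier 0 is manipulable
]
fragile_regions = [
    [{"cat", "dog"}, {"cat", "dog"}, {"cat"}],  # two classifiers can be flipped at once
]

print(certify_sample(robust_regions, "cat"))
print(certify_sample(fragile_regions, "cat"))
```

In the first case every combination still has at least two votes for `"cat"`, so the check passes. In the second case the combination `("dog", "dog", "cat")` changes the winner, so the sample is not certified under these assumptions. The enumeration cost grows with the product of the label-set sizes, so the sketch is only practical when those sets are small.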