We propose Adaptive Randomized Smoothing (ARS) to certify the predictions of test-time adaptive models against adversarial examples. ARS extends the analysis of randomized smoothing via $f$-Differential Privacy to certify the adaptive composition of multiple steps. For the first time, our theory covers the sound adaptive composition of general, high-dimensional functions of noisy inputs. We instantiate ARS on deep image classification to certify predictions against adversarial examples of bounded $L_{\infty}$ norm. In the $L_{\infty}$ threat model, ARS enables flexible adaptation through high-dimensional input-dependent masking. We design adaptivity benchmarks based on CIFAR-10 and CelebA, on which ARS improves standard test accuracy by $1$ to $15$ percentage points. On ImageNet, ARS improves certified test accuracy by up to $1.6$ percentage points over standard RS without adaptivity. Our code is available at https://github.com/ubc-systopia/adaptive-randomized-smoothing.