With AI increasingly deployed in safety-critical systems, providing formal robustness guarantees for the underlying models is essential. Existing verification methods either rely on overly conservative approximations or incur prohibitive computational costs. For example, the use of lp-norm perturbations in video settings encodes the belief that the adversary can inject noise in every video frame. In practice, adversarial perturbations exhibit structured spatial and temporal correlations, constrained to lower-dimensional, semantically meaningful subspaces. In this work, we study robustness verification of 3D CNNs processing video and volumetric inputs, targeting applications in action recognition (UCF-101), autonomous driving (Udacity), and medical imaging (MedMNIST) exploiting realistic assumptions on adversarial strength by modelling them as spatio-temporal constraints - where the attacker can modify either a subset of frames or patches within a set of consecutive frames. We demonstrate that modelling realistic constraints enables tighter approximations. We introduce Spatio-Temporal Bound Propagation (STBP), a verification framework that computes an exact closed-form characterization of the first convolutional layer and propagates certified bounds through subsequent layers using scalable approximations. Computing the exact closed form provides the tightest bounds for the first convolutional layer. Thus, we utilise approximation methods in the remainder of the network. To spur further progress in this field, we propose ST-Bench, a verification benchmark for autonomous driving and activity recognition, to systematically evaluate verifiable robustness. Compared to existing verification-based approaches, STBP provides stronger robustness guarantees with significantly improved scalability, achieving 1.7x higher certified robust accuracy under identical perturbation budgets.
翻译:随着人工智能越来越多地部署于安全关键系统中,为其底层模型提供形式化的鲁棒性保证至关重要。现有验证方法要么依赖过于保守的近似,要么产生难以承受的计算成本。例如,在视频场景中使用lp-范数扰动编码了攻击者可在每个视频帧中注入噪声的假设。实际上,对抗性扰动表现出结构化的时空相关性,被约束在低维、语义有意义的子空间中。本文研究处理视频和体积输入的3D CNN的鲁棒性验证,针对动作识别(UCF-101)、自动驾驶(Udacity)和医学成像(MedMNIST)等应用,通过将对抗强度建模为时空约束——即攻击者仅能修改若干连续帧中的子集帧或帧内补丁——来利用现实假设。我们证明,对现实约束进行建模能够实现更紧致的近似。我们提出时空界限传播(STBP)验证框架,该框架计算第一卷积层的精确闭式表征,并通过可扩展的近似方法传播后续层的认证界。精确闭式的计算为第一卷积层提供了最紧致的界。因此,我们在网络其余部分采用近似方法。为推动该领域进一步发展,我们提出面向自动驾驶和活动识别的验证基准ST-Bench,用于系统评估可验证鲁棒性。与现有基于验证的方法相比,STBP在显著提升可扩展性的同时提供更强的鲁棒性保证,在相同扰动预算下实现了1.7倍更高的认证鲁棒准确率。