Vision-Language-Action (VLA) models are increasingly deployed on real robots, where each predicted action is executed and each failure carries a safety cost. They reach high success rates on clean inputs but collapse under small adversarial perturbations. A $16/255$ PGD attack on OpenVLA-7B drops LIBERO success from above $95\%$ to under $5\%$. Empirical defenses recover some robustness at a cost in clean accuracy, but the literature does not say whether the trade-off has a theoretical floor. We prove that it does. For any VLA policy with discrete actions, the sum of capability (mutual information between policy action and oracle action) and robustness (mutual information preserved under adversarial perturbation, net of trivial channel leakage) is upper-bounded by a policy-independent budget: task entropy plus adversarial channel capacity. The proof is two applications of the Data Processing Inequality plus MI non-negativity. The pixel-level bound is loose on current models ($\sim 10^3$ nats), but an encoder-specific corollary restricts the channel to the policy-relevant subspace, reducing the budget from $\sim 5{,}000$ to $\sim 31$ nats on OpenVLA; the policy already consumes $\sim 24\%$ of this tighter budget, leaving limited room for simultaneous robustness improvement. We validate the bound across $252$ closed-form Gaussian-VLA cells and $48$ OpenVLA-7B $\times$ LIBERO $\times$ PGD cells (zero violations). We propose encoder-specific slack as a normalized comparison axis for defense papers, and release all code, manifests, and results.
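As a rough illustration of the quantities in the bound, the sketch below estimates mutual information in nats between discrete action sequences with a plug-in estimator, then reports how much of a given budget is consumed. It is a minimal sketch under stated assumptions, not the released code: the function names (`entropy_nats`, `mutual_information_nats`, `budget_slack`) are hypothetical, the plug-in estimator is one simple choice among several, and the robustness term and channel capacity are passed in as numbers because their exact definitions are given in the paper body rather than here.

```python
import math
from collections import Counter


def entropy_nats(samples):
    """Plug-in (empirical) Shannon entropy of a discrete sample, in nats."""
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def mutual_information_nats(xs, ys):
    """Plug-in estimate of I(X;Y) = H(X) + H(Y) - H(X,Y), in nats."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    joint = list(zip(xs, ys))
    return entropy_nats(xs) + entropy_nats(ys) - entropy_nats(joint)


def budget_slack(capability, robustness, task_entropy, channel_capacity):
    """Return (slack, fraction_used) for the budget described in the text.

    Assumption: budget = task entropy + adversarial channel capacity, and
    the bound reads capability + robustness <= budget. All inputs in nats.
    """
    budget = task_entropy + channel_capacity
    used = capability + robustness
    return budget - used, used / budget


if __name__ == "__main__":
    # Toy data: four discrete actions, policy wrong on every fifth step.
    oracle = [0, 1, 2, 3] * 25
    policy = [a if i % 5 else (a + 1) % 4 for i, a in enumerate(oracle)]

    capability = mutual_information_nats(policy, oracle)
    task_entropy = entropy_nats(oracle)

    # Robustness and channel capacity are placeholder values here.
    slack, used = budget_slack(capability, 0.0, task_entropy, 0.0)
    print(f"capability={capability:.3f} nats, task entropy={task_entropy:.3f} nats")
    print(f"slack={slack:.3f} nats, fraction used={used:.2%}")
```

With the robustness term and channel capacity set to zero, the check reduces to the elementary fact that $I(A; A^*) \le H(A^*)$, which is the capability half of the budget. A plug-in estimator is biased on small samples, so a real evaluation would likely need a bias-corrected estimator or many more rollouts.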