The intrinsic complexity of deep neural networks (DNNs) makes it challenging to verify not only the networks themselves but also the hosting DNN-controlled systems. Reachability analysis of these systems faces the same challenge. Existing approaches rely on over-approximating DNNs using simpler polynomial models. However, they suffer from low efficiency and large overestimation, and are restricted to specific types of DNNs. This paper presents a novel abstraction-based approach to bypass the crux of over-approximating DNNs in reachability analysis. Specifically, we extend conventional DNNs by inserting an additional abstraction layer, which abstracts a real number to an interval for training. The inserted abstraction layer ensures that the values represented by an interval are indistinguishable to the network for both training and decision-making. Leveraging this, we devise the first black-box reachability analysis approach for DNN-controlled systems, where trained DNNs are only queried as black-box oracles for the actions on abstract states. Our approach is sound, tight, efficient, and agnostic to any DNN type and size. The experimental results on a wide range of benchmarks show that the DNNs trained by using our approach exhibit comparable performance, while the reachability analysis of the corresponding systems becomes more amenable with significant tightness and efficiency improvement over the state-of-the-art white-box approaches.
翻译:深度神经网络的固有复杂性使得验证网络本身及其所控制的系统都极具挑战性。这些系统的可达性分析面临同样难题。现有方法依赖于使用更简单的多项式模型对深度神经网络进行过近似,但存在效率低、过估计大且仅限于特定网络类型的问题。本文提出一种新颖的基于抽象的方法,可绕过可达性分析中深度神经网络过近似这一核心难题。具体而言,我们通过在传统深度神经网络中插入额外的抽象层,将实数抽象为区间进行训练。该抽象层确保网络在训练和决策过程中无法区分区间内不同数值所代表的值。基于此特性,我们首次提出面向深度神经网络控制系统的黑盒可达性分析方法:训练好的深度神经网络仅作为黑盒预言机,用于查询抽象状态上的动作。该方法具有可靠性、紧致性、高效性,且不依赖于深度神经网络的类型与规模。在广泛基准测试上的实验结果表明,使用本文方法训练的深度神经网络性能相当,但对应系统的可达性分析在紧致性和效率上均显著优于当前最先进的白盒方法。