Deep neural networks (DNNs) play a crucial role in machine learning, achieving state-of-the-art performance across numerous application domains. Despite this success, DNN-based models can generalize poorly, i.e., fail to handle inputs that were not encountered during training. This limitation poses a significant challenge to deploying deep learning in safety-critical tasks, as well as in real-world settings characterized by substantial variability. We introduce a novel approach that harnesses DNN verification technology to identify DNN-driven decision rules that generalize robustly to previously unencountered input domains. Our method assesses generalization within an input domain by measuring the level of agreement between independently trained DNNs on inputs in that domain. We realize our approach efficiently using off-the-shelf DNN verification engines, and evaluate it extensively on both supervised and unsupervised DNN benchmarks, including a deep reinforcement learning (DRL) system for Internet congestion control, demonstrating the applicability of our approach to real-world settings. Moreover, our work introduces a novel objective for formal verification, offering the prospect of mitigating the challenges of deploying DNN-driven systems in real-world scenarios.
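To make the agreement criterion concrete, the following is a minimal sketch of the underlying idea: score an input domain by how often independently trained networks produce the same decision. Note that this is only a sampled proxy; the approach described above uses verification engines to reason about agreement over entire input domains rather than finite samples. All function and variable names here are illustrative, not from the paper.

```python
# Sketch of the agreement heuristic over a finite sample of inputs.
# The actual method uses DNN verification to bound disagreement over
# a whole input domain; this sampled version is for illustration only.
import numpy as np

def full_agreement_rate(predictions):
    """predictions: list of 1-D arrays of predicted class labels,
    one array per independently trained model (same input order).
    Returns the fraction of inputs on which every model agrees."""
    preds = np.stack(predictions)           # shape: (n_models, n_inputs)
    agree = np.all(preds == preds[0], axis=0)
    return float(agree.mean())

# Toy usage: three "models" classifying five inputs from some domain.
m1 = np.array([0, 1, 1, 0, 2])
m2 = np.array([0, 1, 1, 1, 2])
m3 = np.array([0, 1, 1, 0, 2])
print(full_agreement_rate([m1, m2, m3]))    # 0.8: one input is contested
```

A high agreement rate within a domain is taken as evidence that the learned decision rule generalizes well there; domains where the independently trained models diverge are flagged as risky.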