The increasing use of deep neural networks (DNNs) in safety-critical systems has raised concerns about their potential to exhibit undesirable behaviors. While DNN verification and testing provide post hoc conclusions about unexpected behaviors, they do not prevent the erroneous behaviors from occurring. To address this issue, DNN repair (or patching) aims to eliminate unexpected predictions generated by defective DNNs. Two typical DNN repair paradigms are retraining and fine-tuning. However, existing methods focus on high-level abstract interpretation or inference over state spaces and ignore the outputs of the underlying neurons. This makes the patching process computationally prohibitive and largely restricts it to piecewise linear (PWL) activation functions. To address these shortcomings, we propose a behavior-imitation based repair framework, BIRDNN, which integrates the two repair paradigms for the first time. In the retraining repair procedure, BIRDNN corrects incorrect predictions on negative samples by imitating the closest expected behaviors of positive samples. In the fine-tuning repair procedure, BIRDNN analyzes how neurons behave differently on positive and negative samples to identify the neurons most responsible for the erroneous behaviors. To tackle the more challenging domain-wise repair problems (DRPs), we combine BIRDNN with a domain behavior characterization technique to repair buggy DNNs in a probably approximately correct (PAC) style. We also implement a prototype tool based on BIRDNN and evaluate it on the ACAS Xu DNNs. Our experimental results show that BIRDNN can successfully repair buggy DNNs with significantly higher efficiency than state-of-the-art repair tools. Additionally, BIRDNN is highly compatible with different activation functions.
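To make the two repair ideas concrete, the following is a minimal sketch in Python with NumPy, not the BIRDNN implementation. It rests on assumptions that the abstract does not state: "closest" is taken to mean nearest in input space under Euclidean distance, and a neuron's responsibility is scored by the gap between its mean activation on positive and on negative samples. The function names `forward`, `imitation_targets`, and `rank_neurons` are hypothetical.

```python
import numpy as np


def forward(weights, biases, x, act=np.tanh):
    """Run a small fully connected network.

    Returns the output and the list of hidden-layer activations.
    Any elementwise activation can be passed in; tanh is only a default.
    """
    hidden = []
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = act(h @ W + b)
        hidden.append(h)
    out = h @ weights[-1] + biases[-1]
    return out, hidden


def imitation_targets(neg_x, pos_x, pos_y):
    """Assumed imitation rule: each negative sample borrows the expected
    output of its nearest positive sample (Euclidean distance on inputs)."""
    # Pairwise distances, shape (n_neg, n_pos).
    dist = np.linalg.norm(neg_x[:, None, :] - pos_x[None, :, :], axis=2)
    return pos_y[np.argmin(dist, axis=1)]


def rank_neurons(hidden_pos, hidden_neg):
    """Assumed localization rule: rank (layer, index) pairs by the absolute
    gap between mean activation on positive and on negative samples."""
    scores = []
    for layer, (hp, hn) in enumerate(zip(hidden_pos, hidden_neg)):
        gap = np.abs(hp.mean(axis=0) - hn.mean(axis=0))
        scores.extend((float(g), layer, i) for i, g in enumerate(gap))
    # Largest gap first.
    return [(layer, i) for _, layer, i in sorted(scores, reverse=True)]
```

Under these assumptions, a retraining-style repair would fit the network to the targets returned by `imitation_targets` on the negative samples, while a fine-tuning-style repair would adjust only the weights feeding the top-ranked neurons from `rank_neurons`. Both functions work on activations alone, so nothing in the sketch depends on the activation being piecewise linear.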