Neuron Sensitivity Guided Test Case Selection for Deep Learning Testing

Deep Neural Networks (DNNs) have been widely deployed in software to address various tasks (e.g., autonomous driving, medical diagnosis). However, they can also behave incorrectly, causing financial losses and even threatening human safety. To reveal incorrect behaviors in DNNs and repair them, DNN developers often collect rich unlabeled datasets from the natural world and label them to test the DNN models. However, properly labeling large volumes of unlabeled data is highly expensive and time-consuming. To address this problem, we propose NSS, Neuron Sensitivity guided test case Selection, which reduces labeling time by selecting valuable test cases from unlabeled datasets. NSS leverages the internal neuron information induced by test cases to select those that are highly likely to cause the model to behave incorrectly. We evaluate NSS on four widely used datasets and four well-designed DNN models, comparing it against SOTA baseline methods. The results show that NSS performs well in assessing both the fault-triggering probability of test cases and their capability to improve the model. Specifically, NSS obtains a higher fault detection rate than the baseline approaches (e.g., when selecting 5% of the test cases from the unlabeled dataset in the MNIST and LeNet1 experiment, NSS obtains an 81.8% fault detection rate, 20% higher than the baselines).
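The selection-and-evaluation workflow described above can be illustrated with a minimal sketch. It is not the NSS algorithm: the abstract does not specify how neuron sensitivity is computed, so the per-input `scores` below are hypothetical placeholders for whatever sensitivity score a method assigns. The helper names `select_top_fraction` and `fault_detection_rate` are likewise assumptions, and the rate is computed as the share of selected inputs that the model misclassifies, which is one common definition and may differ from the one used in the paper.

```python
import math


def select_top_fraction(scores, fraction):
    """Return indices of the highest-scoring inputs, keeping ceil(fraction * n)."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    budget = math.ceil(fraction * len(scores))
    # Rank input indices from highest to lowest score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:budget]


def fault_detection_rate(selected, predictions, labels):
    """Share of selected inputs that the model misclassifies (assumed definition)."""
    if not selected:
        return 0.0
    faults = sum(1 for i in selected if predictions[i] != labels[i])
    return faults / len(selected)


# Toy data: hypothetical per-input scores, model predictions, and true labels.
scores = [0.91, 0.10, 0.75, 0.05, 0.60, 0.20, 0.88, 0.15, 0.30, 0.02]
predictions = [3, 1, 7, 0, 4, 2, 9, 5, 6, 8]
labels = [5, 1, 7, 0, 4, 2, 4, 5, 6, 8]

# Only the selected inputs would need to be labeled by hand.
selected = select_top_fraction(scores, 0.5)
rate = fault_detection_rate(selected, predictions, labels)
print(selected, rate)
```

In this toy setting, only the five selected inputs need labels, and the rate measures how many of them expose a misclassification. A better scoring function pushes more misclassified inputs into the labeling budget.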


