Finding Deviated Behaviors of the Compressed DNN Models for Image Classifications

From arXiv. This is the author version. The DOI of the published version is http://dx.doi.org/10.1145/3583564. Please see the full abstract in the PDF.

Model compression can significantly reduce the sizes of deep neural network (DNN) models and thus facilitates the dissemination of sophisticated, sizable DNN models, especially their deployment on mobile or embedded devices. However, the prediction results of compressed models may deviate from those of their original models. To help developers thoroughly understand the impact of model compression, it is essential to test these models and find such deviated behaviors before dissemination. This is a non-trivial task because the architectures and gradients of compressed models are usually not available. To this end, we propose DFLARE, a novel, search-based, black-box testing technique that automatically finds triggering inputs, i.e., inputs that result in deviated behaviors, in image classification tasks. DFLARE iteratively applies a series of mutation operations to a given seed image until a triggering input is found. For better efficacy and efficiency, DFLARE models the search problem as Markov chains and leverages the Metropolis-Hastings algorithm to guide the selection of mutation operators in each iteration. Further, DFLARE utilizes a novel fitness function to prioritize mutated inputs that either cause large differences between the two models' outputs or trigger previously unobserved probability vectors from the models. We evaluated DFLARE on 21 compressed models for image classification tasks with three datasets. The results show that DFLARE outperforms the baseline in terms of efficacy and efficiency. We also demonstrated that the triggering inputs found by DFLARE can be used to repair up to 48.48% of the deviated behaviors in image classification tasks and to further decrease the effectiveness of DFLARE on the repaired models.
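The search loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that both models are black-box callables returning probability vectors, that images are flat lists of floats in [0, 1], that operators are scored by a smoothed success ratio, that the Metropolis-Hastings proposal is uniform, and that the fitness combines an L1 output gap with a rounding-based novelty check. All function names (`find_triggering_input`, `pick_operator`, `output_gap`, and so on) are hypothetical.

```python
import random


def argmax(probs):
    """Index of the largest probability (first index wins ties)."""
    return max(range(len(probs)), key=probs.__getitem__)


def is_triggering(p_orig, p_comp):
    """A triggering input makes the two models predict different labels."""
    return argmax(p_orig) != argmax(p_comp)


def output_gap(p_orig, p_comp):
    """Assumed fitness term: L1 distance between the two probability vectors."""
    return sum(abs(a - b) for a, b in zip(p_orig, p_comp))


def pick_operator(current, stats, rng):
    """One Metropolis-Hastings step over operator indices.

    The proposal is uniform (symmetric), so the acceptance probability is
    min(1, score(proposal) / score(current)). The score is a smoothed
    success ratio, an assumption made for this sketch.
    """
    def score(i):
        wins, tries = stats[i]
        return (wins + 1) / (tries + 2)

    proposal = rng.randrange(len(stats))
    if rng.random() < min(1.0, score(proposal) / score(current)):
        return proposal
    return current


def find_triggering_input(seed, original, compressed, operators,
                          max_iter=1000, rng=None):
    """Mutate `seed` until the models disagree; None if the budget runs out."""
    rng = rng or random.Random(0)
    p_o, p_c = original(seed), compressed(seed)
    if is_triggering(p_o, p_c):
        return seed
    stats = [[0, 0] for _ in operators]  # [wins, tries] per operator
    seen = set()                         # coarse signatures of observed outputs
    current, current_gap = seed, output_gap(p_o, p_c)
    op = rng.randrange(len(operators))
    for _ in range(max_iter):
        op = pick_operator(op, stats, rng)
        candidate = operators[op](current, rng)
        # Black-box access only: we query outputs, never gradients.
        p_o, p_c = original(candidate), compressed(candidate)
        if is_triggering(p_o, p_c):
            return candidate
        stats[op][1] += 1
        gap = output_gap(p_o, p_c)
        signature = tuple(round(p, 2) for p in (*p_o, *p_c))
        is_new = signature not in seen
        seen.add(signature)
        # Keep the mutant if it widens the gap or yields unseen vectors.
        if gap > current_gap or is_new:
            stats[op][0] += 1
            current, current_gap = candidate, gap
    return None
```

Each operator is expected to be a function `(image, rng) -> image`. Operators that more often produce accepted mutants earn a higher score and are therefore selected more often, which is the intuition behind guiding operator selection with Metropolis-Hastings rather than sampling operators uniformly.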
