To ensure resilient neural network processing even on unreliable hardware, a comprehensive reliability analysis against various hardware faults is generally required before deep neural network models are deployed, so efficient fault injection tools are in high demand. However, most existing fault injection tools are limited to basic fault injection into neurons and do not support fine-grained vulnerability analysis. In addition, many of them require changes to the neural network model and couple fault injection tightly with normal neural network processing, which complicates their use and slows down fault simulation. In this work, we propose MRFI, a highly configurable multi-resolution fault injection tool for deep neural networks. Users modify an independent fault configuration file, rather than the neural network model, to perform fault injection and vulnerability analysis. In particular, MRFI integrates extensive fault analysis functionality from different perspectives and enables multi-resolution investigation of neural network vulnerability. Moreover, it does not modify the core neural network computing framework, PyTorch. Hence, it naturally supports parallel processing on GPUs and achieves fast fault simulation in our experiments.
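The decoupling described in the abstract, where faults are driven by a separate configuration rather than by edits to the model, can be illustrated with a minimal sketch. This is not MRFI's actual API or configuration format: the names `flip_bit`, `MODEL`, `FAULT_CONFIG`, and `run` are hypothetical, the "model" is a toy list of pure-Python layers, and the only fault model shown is a single bit flip in the IEEE-754 float32 encoding of an activation. In PyTorch, the same separation would typically be achieved with forward hooks (`torch.nn.Module.register_forward_hook`) rather than a custom runner.

```python
import random
import struct


def flip_bit(value, bit):
    """Flip one bit (0 = LSB, 31 = sign) of value's IEEE-754 float32 encoding."""
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    (faulty,) = struct.unpack("<f", struct.pack("<I", bits ^ (1 << bit)))
    return faulty


# Toy "model": an ordered list of named layers. It is never edited
# in order to inject faults.
def scale(xs):
    return [2.0 * x for x in xs]


def relu(xs):
    return [max(0.0, x) for x in xs]


MODEL = [("scale", scale), ("relu", relu)]

# Hypothetical fault configuration, kept apart from the model definition.
# Each entry selects a layer, a per-activation fault rate, and the bit to flip.
FAULT_CONFIG = {
    "scale": {"rate": 1.0, "bit": 31},  # flip the sign bit of every output
}


def run(model, xs, config=None, seed=0):
    """Run the model; corrupt layer outputs only where the config says so."""
    rng = random.Random(seed)  # seeded so that fault positions are reproducible
    config = config or {}
    for name, layer in model:
        xs = layer(xs)
        rule = config.get(name)
        if rule is not None:
            xs = [
                flip_bit(x, rule["bit"]) if rng.random() < rule["rate"] else x
                for x in xs
            ]
    return xs


golden = run(MODEL, [1.0, -3.0])                # fault-free reference run
faulty = run(MODEL, [1.0, -3.0], FAULT_CONFIG)  # same model, faults enabled
```

Comparing `golden` with `faulty` gives a per-layer view of vulnerability. Changing which layers, bits, or rates are targeted only requires editing `FAULT_CONFIG`, which is the property the abstract emphasizes.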