Deep Neural Network (DNN) accelerators are widely used to improve the computational efficiency of DNNs, but they are prone to faults caused by Single-Event Upsets (SEUs). In this work, we present an in-depth analysis of the impact of SEUs on a Systolic Array (SA) based DNN accelerator. We perform a fault injection campaign in a Register-Transfer Level (RTL) simulation environment to improve the observability of each hardware block, including the SA itself and the post-processing pipeline. From this analysis, we report the sensitivity of various flip-flop groups, independent of the DNN model architecture, in terms of both fault propagation probability and fault magnitude. This allows us to draw detailed conclusions and determine optimal mitigation strategies.
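To make the two reported metrics concrete, the following is a minimal software sketch of a single-bit-flip campaign on one accumulator register. It is not the RTL environment described above: the 32-bit accumulator width, the ReLU-shift-saturate post-processing step, and the names `flip_bit`, `post_process`, and `campaign` are all assumptions chosen for illustration.

```python
import random

ACC_BITS = 32  # assumed accumulator width, not taken from the paper


def to_signed(value, bits=ACC_BITS):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def flip_bit(value, bit, bits=ACC_BITS):
    """Model an SEU: invert one bit of a signed register value."""
    return to_signed((value & ((1 << bits) - 1)) ^ (1 << bit), bits)


def post_process(acc, shift=8):
    """Hypothetical post-processing: ReLU, right shift, saturate at 127."""
    return min(max(acc, 0) >> shift, 127)


def campaign(bit, trials=2000, length=16, seed=0):
    """Flip `bit` in the accumulator of random int8 dot products.

    Returns (propagation probability, mean magnitude of propagated faults),
    both measured at the post-processed output.
    """
    rng = random.Random(seed)
    propagated, total_magnitude = 0, 0
    for _ in range(trials):
        weights = [rng.randint(-128, 127) for _ in range(length)]
        inputs = [rng.randint(-128, 127) for _ in range(length)]
        golden_acc = sum(w * x for w, x in zip(weights, inputs))
        faulty_acc = flip_bit(golden_acc, bit)
        golden, faulty = post_process(golden_acc), post_process(faulty_acc)
        if golden != faulty:  # the fault reached the output
            propagated += 1
            total_magnitude += abs(golden - faulty)
    probability = propagated / trials
    mean_magnitude = total_magnitude / propagated if propagated else 0.0
    return probability, mean_magnitude


if __name__ == "__main__":
    for bit in (0, 8, 16, 31):
        print(bit, campaign(bit))
```

Under these assumptions, `campaign(0)` reports no propagation because the right shift discards the low 8 bits, while a flip of bit 31 changes the sign of the accumulator and reaches the output in most trials. This illustrates why sensitivity is reported per flip-flop group rather than as a single number.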