Reliability has been a major concern in embedded systems. Higher transistor density and lower voltage supply increase the vulnerability of embedded systems to soft errors. A Single Event Upset (SEU), which is also called a soft error, can reverse a bit in a sequential element, resulting in a system failure. Simulation-based fault injection has been widely used to evaluate reliability, as suggested by ISO26262. However, it is practically impossible to test all faults for a complex design. Random fault injection is a compromise that reduces accuracy and fault coverage. Formal verification is an alternative approach. In this paper, we use formal verification, in the form of model checking, to evaluate the hardware reliability of a RISC-V Ibex Core in the presence of soft errors. Backward tracing is performed to identify and categorize faults according to their effects (no effect, Silent Data Corruption, crashes, and hangs). By using formal verification, the entire state space and fault list can be exhaustively explored. It is found that misaligned instructions can amplify fault effects. It is also found that some bits are more vulnerable to SEUs than others. In general, most of the bits in the Ibex Core are vulnerable to Silent Data Corruption, and the second pipeline stage is more vulnerable to Silent Data Corruption than the first.
翻译:可靠性一直是嵌入式系统中的主要关注点。更高的晶体管密度和更低的供电电压增加了嵌入式系统对软错误的易感性。单粒子翻转(SEU),也称为软错误,可以翻转时序元件中的单个比特,从而导致系统故障。基于仿真的故障注入已被广泛用于评估可靠性,如ISO26262所建议。然而,对于复杂设计而言,测试所有故障实际上是不可能的。随机故障注入是一种折衷方案,但这会降低准确性和故障覆盖率。形式化验证是一种替代方法。本文采用模型检查形式的形式化验证,评估RISC-V Ibex内核在软错误存在时的硬件可靠性。通过反向追踪来识别故障并根据其影响(无影响、静默数据损坏、崩溃和挂起)进行分类。利用形式化验证,可以穷举探索整个状态空间和故障列表。研究发现,未对齐指令会放大故障影响。同时,某些比特位比其它比特位更易受SEU影响。总体而言,Ibex内核中的大多数比特位易受静默数据损坏影响,且第二流水线阶段比第一阶段更易受静默数据损坏。