QED: Scalable Verification of Hardware Memory Consistency

Memory consistency model (MCM) issues in out-of-order-issue microprocessor-based shared-memory systems are notoriously non-intuitive and a source of hardware design bugs. Prior hardware verification work is limited to in-order-issue processors, to proving the correctness only of some test cases, or to bounded verification that does not scale in practice beyond 7 instructions across all threads. Because cache coherence (i.e., write serialization and atomicity) and pipeline front-end verification and testing are well-studied, we focus on the memory ordering in an out-of-order-issue processor's load-store queue and the coherence interface between the core and global coherence. We propose QED based on the key notion of observability that any hardware reordering matters only if a forbidden value is produced. We argue that one needs to consider (1) only directly-ordered instruction pairs -- transitively non-redundant pairs connected by an edge in the MCM-imposed partial order -- and not all in-flight instructions, and (2) only the ordering of external events from other cores (e.g.,invalidations) but not the events' originating cores, achieving verification scalability in both the numbers of in-flight memory instructions and of cores. Exhaustively considering all pairs of instruction types and all types of external events intervening between each pair, QED attempts to restore any reordered instructions to an MCM-complaint order without changing the execution values, where failure indicates an MCM violation. Each instruction pair's exploration results in a decision tree of simple, narrowly-defined predicates to be evaluated against the RTL. In our experiments, we automatically generate the decision trees for SC, TSO, and RISC-V WMO, and illustrate automatable verification by evaluating a substantial predicate against BOOMv3 implementation of RISC-V WMO, leaving full automation to future work.

翻译：在基于乱序发射微处理器的共享内存系统中，内存一致性模型（MCM）问题因其高度反直觉的特性成为硬件设计缺陷的重要来源。现有硬件验证工作局限于顺序发射处理器、仅验证部分测试用例的正确性，或采用有界验证方法（实际应用中跨所有线程的指令数不超过7条）。鉴于缓存一致性（即写串行化与原子性）及流水线前端验证与测试已得到充分研究，本文聚焦于乱序发射处理器加载-存储队列中的内存排序问题，以及处理器核与全局一致性模块之间的接口。我们提出基于关键可观测性概念的QED方法——只有当硬件重排序产生了被禁止的值时，该重排序才具有实际影响。论证表明：（1）只需考虑MCM偏序约束下由边直接连接的指令对（即传递非冗余对），而非所有执行中指令；（2）仅需关注来自其他处理器核的外部事件（如失效操作）的排序，无需追踪事件发起核。由此实现执行中内存指令数量与处理器核数量的双重可扩展验证。通过穷举所有指令类型对及每对指令间可能介入的外部事件类型，QED尝试在不改变执行结果的前提下将被重排序的指令恢复至符合MCM规范的顺序，若恢复失败则表明存在MCM违规。每条指令对的探索生成由简单窄化谓词构成的决策树，该决策树可直接在RTL上求值。实验表明，我们已自动生成的SC、TSO及RISC-V WMO协议的决策树，并通过评估针对BOOMv3（RISC-V WMO实现）的实质性谓词验证了自动化验证能力，完整自动化方案留待未来研究。