DESIL: Detecting Silent Bugs in MLIR Compiler Infrastructure

The MLIR (Multi-Level Intermediate Representation) compiler infrastructure provides an efficient framework for introducing new abstraction levels for programming languages and domain-specific languages. It has attracted widespread attention in recent years and has been applied in various domains, such as deep learning compiler construction. Several MLIR compiler fuzzing techniques, such as MLIRSmith and MLIRod, have recently been proposed. However, none of them can detect silent bugs, i.e., bugs that cause the compiler to optimize code incorrectly without reporting any error. The difficulty in detecting silent bugs arises from two main aspects: (1) UB-free program generation: the generated programs must be free of undefined behavior (UB) to satisfy the non-UB assumptions that compiler optimizations rely on. (2) Lowering support: a given MLIR program must be converted into an executable form so that execution results can be compared, and a suitable lowering path must be selected to avoid redundant lowering passes and improve fuzzing efficiency. To address these issues, we propose DESIL. DESIL enables silent bug detection by defining a set of UB-elimination rules based on the MLIR documentation and applying them to input programs to produce UB-free MLIR programs. To convert the dialects in a given MLIR program into an executable form, DESIL designs a lowering path optimization strategy. Furthermore, DESIL incorporates differential testing for silent bug detection. To this end, it introduces an operation-aware optimization recommendation strategy into the compilation process to generate diverse executable files. We applied DESIL to the latest revisions of the MLIR compiler infrastructure. It detected 23 silent bugs and 19 crash bugs, of which 12/14 have been confirmed or fixed.
