MoCo: Fuzzing Deep Learning Libraries via Assembling Code

Rapidly developing deep learning (DL) techniques have been applied in software systems across a wide range of application scenarios. However, they can also pose new safety threats with potentially serious consequences, especially in safety-critical domains. DL libraries are the foundation on which DL systems are built, so bugs in them can have unpredictable effects that directly alter the behavior of those systems. Previous research on fuzzing DL libraries remains limited in the diversity of test inputs, the construction of test oracles, and the precision of detection. In this paper, we propose MoCo, a novel fuzz testing method for DL libraries that works by assembling code. MoCo first disassembles the seed code file to obtain a template and a set of code blocks, and then applies code block mutation operators (e.g., API replacement, random generation, and boundary checking) to generate additional code blocks that fit the template. By inserting context-appropriate code blocks into the template step by step, MoCo generates a tree of code files with intergenerational relations. Using the derivation relations in this tree and the mutation operators applied, we construct a test oracle based on execution state consistency. Because the granularity of code assembly and mutation is controlled rather than randomly divergent, we can quickly pinpoint the lines of code where bugs are located and the conditions that trigger them. We conduct a comprehensive experiment to evaluate the efficiency and effectiveness of MoCo on three widely used DL libraries (i.e., TensorFlow, PyTorch, and Jittor). In the experiment, MoCo detected 64 new bugs of four types across the three libraries, of which 51 have been confirmed and 13 have been fixed by developers.
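
The assembly and oracle steps described above can be illustrated with a small sketch. This is not MoCo's implementation: the names CodeFile, derive, execution_state, inconsistent_nodes, and STATE_PRESERVING are hypothetical, Python's standard math module stands in for a DL library, and the rule that certain operators should leave the execution state unchanged is an assumption made only for illustration. The sketch shows how a tree of derived files lets a child be compared against its parent, and how the single block that separates them narrows down the suspect line.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Assumption: operators in this set are expected to leave the execution
# state unchanged relative to the parent file. The labels are hypothetical.
STATE_PRESERVING = {"api_replacement", "random_generation"}


@dataclass(eq=False)
class CodeFile:
    """One generated code file, stored as a node of the derivation tree."""
    lines: List[str]
    operator: Optional[str] = None          # operator that produced the last block
    parent: Optional["CodeFile"] = None
    children: List["CodeFile"] = field(default_factory=list)

    def derive(self, block: str, operator: str) -> "CodeFile":
        """Append one code block and record the parent-child relation."""
        child = CodeFile(self.lines + [block], operator, self)
        self.children.append(child)
        return child


def execution_state(code_file: CodeFile) -> str:
    """Run the file in a fresh namespace; return 'ok' or the exception name."""
    try:
        exec("\n".join(code_file.lines), {})
    except Exception as exc:
        return type(exc).__name__
    return "ok"


def inconsistent_nodes(root: CodeFile) -> List[CodeFile]:
    """Collect children whose state differs from the parent's state even
    though the applied operator was assumed to preserve it."""
    flagged = []
    stack = [root]
    while stack:
        node = stack.pop()
        for child in node.children:
            if (child.operator in STATE_PRESERVING
                    and execution_state(child) != execution_state(node)):
                flagged.append(child)
            stack.append(child)
    return flagged


# Toy derivation tree: a template prefix, one seed block, two mutated blocks.
root = CodeFile(["import math", "x = 4.0"])
seed = root.derive("y = math.sqrt(x)", "seed_block")
same = seed.derive("z = math.pow(x, 0.5)", "api_replacement")
diff = seed.derive("z = math.sqrt(-x)", "random_generation")

flagged = inconsistent_nodes(root)
for node in flagged:
    # The last line is the only difference from the parent file.
    print(node.operator, "->", node.lines[-1])
```

Because each child differs from its parent by exactly one inserted block, a flagged node points directly at one line and one operator. In this toy case the flagged line is only an invalid input to math.sqrt, so a flagged node should be read as a candidate for triage rather than a confirmed bug.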
