Program Repair by Fuzzing over Patch and Input Space

Fuzz testing (fuzzing) is a well-known method for exposing bugs and vulnerabilities in software systems. Popular fuzzers, such as AFL, use a biased random search over the domain of program inputs, executing hundreds or thousands of inputs (test cases) per second in order to expose bugs. If a bug is discovered, it can either be fixed manually by the developer or fixed automatically using an Automated Program Repair (APR) tool. Like fuzzing, many existing APR tools are search-based, but they search over the domain of patches rather than inputs. In this paper, we propose to view search-based program repair as patch-level fuzzing. The basic idea is to adapt a fuzzer (AFL) to fuzz over the patch space rather than the input space. We thus use a patch-space fuzzer to explore the patch space, while using a traditional input-level fuzzer to rule out patch candidates and help in patch selection. To improve throughput, we propose a compilation-free patch validation methodology, in which we execute the original (unpatched) program natively and selectively interpret only the specific patched statements and expressions. Since this avoids (re)compilation, we show that compilation-free patch validation can achieve throughput similar to that of input-level fuzzing (hundreds or thousands of executions per second). We show that patch-level fuzzing and input-level fuzzing can be combined into a co-exploration of both spaces in order to find better-quality patches. This collaboration between input-level fuzzing and patch-level fuzzing is then employed to search over candidate fix locations, as well as over the patch candidates at each fix location.
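
The co-exploration idea can be illustrated with a deliberately small toy, which is a sketch under stated assumptions and not the paper's implementation. Everything in it is hypothetical: the one-line "program" `run_program`, the four-element `PATCH_SPACE`, the `mutate` operator, and the ranking heuristic (agreement with the original program on inputs where the original does not crash). The patch space is simply enumerated here rather than fuzzed, and Python's built-in `eval` stands in for interpreting only the patched expression while the rest of the program runs unchanged.

```python
import random

ORIGINAL = "i <= n"  # hypothetical buggy guard (off by one)
PATCH_SPACE = ["i <= n", "i < n", "i < n - 1", "False"]  # hypothetical candidates


def run_program(patch_expr, i, n):
    """Toy program: only the guard at the fix location is interpreted from
    the patch expression; the surrounding code runs as ordinary Python."""
    buf = [0] * n
    guard = eval(patch_expr, {"__builtins__": {}}, {"i": i, "n": n})
    if guard:
        return buf[i]  # raises IndexError when the guard is too weak
    return None


def crashes(patch_expr, inp):
    """Crash oracle: does this patch candidate crash on this input?"""
    try:
        run_program(patch_expr, *inp)
        return False
    except IndexError:
        return True


def mutate(inp, rng):
    """Input-level mutation: nudge both values, keeping them non-negative."""
    i, n = inp
    return (max(0, i + rng.randint(-2, 2)), max(0, n + rng.randint(-2, 2)))


def co_explore(seeds, rounds=200, seed=0):
    rng = random.Random(seed)
    corpus = list(seeds)
    # Rule out candidates that already crash on the seed inputs.
    survivors = [p for p in PATCH_SPACE
                 if not any(crashes(p, x) for x in corpus)]
    for _ in range(rounds):
        new_input = mutate(rng.choice(corpus), rng)
        corpus.append(new_input)
        # Each newly fuzzed input can rule out more patch candidates.
        survivors = [p for p in survivors if not crashes(p, new_input)]

    # Assumed ranking heuristic: prefer patches that preserve the original
    # behaviour on inputs where the original program did not crash.
    reference = [x for x in corpus if not crashes(ORIGINAL, x)]

    def score(p):
        return sum(run_program(p, *x) == run_program(ORIGINAL, *x)
                   for x in reference)

    return sorted(survivors, key=score, reverse=True)


# Seeds: one crashing input (i == n) and one passing input (i == n - 1).
ranked = co_explore(seeds=[(4, 4), (3, 4)])
print(ranked)
```

In this toy, the crashing seed eliminates the original guard, while the passing seed is what separates the intended fix from the overly restrictive candidates `i < n - 1` and `False`, which never crash but change behaviour on valid inputs. That mirrors, in miniature, why additional inputs help with patch selection and not only with ruling out crashing patches.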
