Grey-box fuzzing is the lightweight approach of choice for finding bugs in sequential programs. It provides a balance between efficiency and effectiveness by conducting a biased random search over the domain of program inputs using a feedback function from observed test executions. For distributed system testing, however, the state of practice today consists only of black-box tools that do not attempt to infer and exploit any knowledge of the system's past behaviours to guide the search for bugs. In this work, we present Mallory: the first framework for grey-box fuzz-testing of distributed systems. Unlike popular black-box distributed system fuzzers, such as Jepsen, that search for bugs by randomly injecting network partitions and node faults or by following human-defined schedules, Mallory is adaptive. It uses a novel metric to learn how to maximize the number of observed system behaviours by choosing different sequences of faults, thus increasing the likelihood of finding new bugs. The key enablers for our approach are the new ideas of timeline-driven testing and timeline abstraction, which provide the feedback function guiding a biased random search for failures. Mallory dynamically constructs Lamport timelines of the system behaviour, abstracts these timelines into happens-before summaries, and introduces faults guided by its real-time observation of the summaries. We have evaluated Mallory on a diverse set of widely used industrial distributed systems. Compared to the state-of-the-art black-box fuzzer Jepsen, Mallory explores more behaviours and takes less time to find bugs. Mallory discovered 22 zero-day bugs (of which 18 were confirmed by developers), including 10 new vulnerabilities, in rigorously tested distributed systems such as Braft, Dqlite, and Redis. Six new CVEs have been assigned.
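To make the idea of timeline abstraction more concrete, the following is a minimal sketch of how a Lamport timeline could be recorded, reduced to a happens-before summary, and scored for novelty. It is an illustration only and not Mallory's implementation: the names `Timeline`, `summarize`, and `Feedback`, the event kinds, and the choice of summarizing a timeline as a set of (earlier kind, later kind) pairs are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Timeline:
    """Hypothetical recorder for one test execution (not Mallory's API)."""
    clocks: dict = field(default_factory=dict)  # node -> Lamport clock
    last: dict = field(default_factory=dict)    # node -> kind of its last event
    events: list = field(default_factory=list)  # (clock, node, kind)
    edges: set = field(default_factory=set)     # (earlier kind, later kind)

    def record(self, node, kind, sender_clock=None, sender_kind=None):
        """Record an event; pass sender_* when the event receives a message."""
        clock = self.clocks.get(node, 0)
        if sender_clock is not None:
            # Lamport rule for receives: take the max of both clocks.
            clock = max(clock, sender_clock)
            self.edges.add((sender_kind, kind))  # send happens-before receive
        clock += 1
        if node in self.last:
            # Program order on the same node is also a happens-before edge.
            self.edges.add((self.last[node], kind))
        self.clocks[node] = clock
        self.last[node] = kind
        self.events.append((clock, node, kind))
        return clock


def summarize(timeline):
    """Assumed abstraction: drop nodes and timestamps, keep event-kind pairs."""
    return frozenset(timeline.edges)


class Feedback:
    """Rewards executions whose summary contains pairs not seen before."""

    def __init__(self):
        self.seen = set()

    def reward(self, summary):
        new_pairs = summary - self.seen
        self.seen |= summary
        return len(new_pairs)


timeline = Timeline()
c1 = timeline.record("n1", "send_vote")
c2 = timeline.record("n2", "recv_vote", sender_clock=c1, sender_kind="send_vote")
c3 = timeline.record("n2", "send_ack")

feedback = Feedback()
first_reward = feedback.reward(summarize(timeline))
second_reward = feedback.reward(summarize(timeline))
```

In this sketch, a fault schedule that produces previously unseen happens-before pairs receives a positive reward, while repeating the same behaviour receives none. A fuzzer could use such a signal to bias its choice of the next fault sequence; how Mallory actually computes and uses its metric is not specified in this abstract.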