Grey-box fuzzing is the lightweight approach of choice for finding bugs in sequential programs. It provides a balance between efficiency and effectiveness by conducting a biased random search over the domain of program inputs using a feedback function from observed test executions. For distributed system testing, however, the state of practice today consists solely of black-box tools that do not attempt to infer and exploit any knowledge of the system's past behaviours to guide the search for bugs. In this work, we present Mallory: the first framework for grey-box fuzz-testing of distributed systems. Unlike popular black-box distributed system fuzzers, such as Jepsen, that search for bugs by randomly injecting network partitions and node faults or by following human-defined schedules, Mallory is adaptive. It uses a novel metric to learn how to maximize the number of observed system behaviours by choosing different sequences of faults, thus increasing the likelihood of finding new bugs. The key enablers for our approach are the new ideas of timeline-driven testing and timeline abstraction that provide the feedback function guiding a biased random search for failures. Mallory dynamically constructs Lamport timelines of the system behaviour, abstracts these timelines into happens-before summaries, and introduces faults guided by its real-time observation of the summaries. We have evaluated Mallory on a diverse set of widely used industrial distributed systems. Compared to the state-of-the-art black-box fuzzer Jepsen, Mallory explores more behaviours and takes less time to find bugs. Mallory discovered 22 zero-day bugs (of which 18 were confirmed by developers), including 10 new vulnerabilities, in rigorously tested distributed systems such as Braft, Dqlite, and Redis. Six new CVEs have been assigned.
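As a rough illustration of the timeline idea described above, the following Python sketch records events with Lamport timestamps, abstracts the resulting timeline into a set of happens-before pairs over event kinds, and uses the number of previously unseen pairs as a feedback signal. It is not Mallory's implementation: the names `Event`, `Timeline`, `summary`, and `novelty`, the event labels, and the choice of "pairs of event kinds" as the abstraction are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Event:
    node: str                    # node on which the event was observed
    kind: str                    # abstract label (hypothetical), e.g. "send:req"
    clock: int                   # Lamport timestamp assigned at recording time
    cause: Optional[int] = None  # index of the matching send, for receive events


class Timeline:
    """Hypothetical recorder that builds a Lamport timeline of observed events."""

    def __init__(self) -> None:
        self.events: List[Event] = []
        self.edges: List[Tuple[int, int]] = []  # direct happens-before edges
        self._clock: Dict[str, int] = {}        # current Lamport clock per node
        self._last: Dict[str, int] = {}         # index of the last event per node

    def record(self, node: str, kind: str, cause: Optional[int] = None) -> int:
        # Lamport rule: take the max of the local clock and the sender's
        # timestamp (if this is a receive), then increment.
        clock = self._clock.get(node, 0)
        if cause is not None:
            clock = max(clock, self.events[cause].clock)
        clock += 1
        self._clock[node] = clock

        index = len(self.events)
        self.events.append(Event(node, kind, clock, cause))
        if node in self._last:       # program order on the same node
            self.edges.append((self._last[node], index))
        if cause is not None:        # message order across nodes
            self.edges.append((cause, index))
        self._last[node] = index
        return index

    def summary(self) -> FrozenSet[Tuple[str, str]]:
        # Abstraction (an assumption of this sketch): forget nodes and
        # timestamps, keep only which event kind directly precedes which.
        return frozenset(
            (self.events[a].kind, self.events[b].kind) for a, b in self.edges
        )


def novelty(summary: FrozenSet[Tuple[str, str]],
            seen: Set[Tuple[str, str]]) -> int:
    """Feedback signal: how many happens-before pairs were never seen before."""
    new_pairs = summary - seen
    seen.update(new_pairs)
    return len(new_pairs)


# Usage: one request/acknowledgement exchange between two nodes.
timeline = Timeline()
send = timeline.record("n1", "send:req")
recv = timeline.record("n2", "recv:req", cause=send)
ack = timeline.record("n2", "send:ack")

seen_pairs: Set[Tuple[str, str]] = set()
first_score = novelty(timeline.summary(), seen_pairs)
second_score = novelty(timeline.summary(), seen_pairs)
```

In a fuzzer built along these lines, a fault schedule that yields a higher novelty score would be favoured in later rounds, which is one simple way to bias the random search toward unexplored behaviours.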