Hybrid testing that integrates fuzzing, symbolic execution, and sampling has demonstrated higher testing efficiency than any of these techniques alone. However, state-of-the-art (SOTA) hybrid testing tools do not fully exploit the capabilities of symbolic execution and sampling, in two key respects. First, they employ tailored symbolic execution engines that tend to over-prune branches, so considerable time is wasted waiting for seeds from the fuzzer and opportunities to discover crashes are missed. Second, existing methods do not apply sampling to the appropriate branches and therefore cannot use its full capability. To address these two limitations, we propose a novel hybrid testing architecture that combines the precision of conventional symbolic execution with the scalability of tailored symbolic execution engines. Based on this architecture, we propose several principles for combining fuzzing, symbolic execution, and sampling. We implement our method in a hybrid testing tool, S$^2$F. To evaluate its effectiveness, we conduct extensive experiments on 15 real-world programs. Experimental results demonstrate that S$^2$F outperforms the SOTA tool, achieving an average improvement of 6.14% in edge coverage and 32.6% in discovered crashes. Notably, our tool uncovers three previously unknown crashes in real-world programs.