Large Language Model assisted Hybrid Fuzzing

Greybox fuzzing, one of the most popular methods for detecting software vulnerabilities, conducts a biased random search over the program input space. To reach deeper program behaviors, it is often combined with concolic execution, which performs a path-sensitive search over the domain of program inputs. In hybrid fuzzing, conventional greybox fuzzing alternates with concolic execution in an iterative loop, and the reachability roadblocks that greybox fuzzing encounters are handed to concolic execution. However, such hybrid fuzzing still suffers from difficulties conventionally faced by symbolic execution, such as the need for environment modeling and system call support. In this work, we show how to achieve the effect of concolic execution without computing or solving symbolic path constraints. When coverage-based greybox fuzzing hits a roadblock in reaching certain branches, we slice the execution trace and suggest modifications of the input that would reach the relevant branches. A Large Language Model (LLM) is used as a solver to generate the modified input for reaching the desired branches. Compared with both the vanilla greybox fuzzer AFL and the hybrid fuzzers Intriguer and Qsym, our LLM-based hybrid fuzzer HyLLfuzz (pronounced "hill fuzz") achieves higher coverage. Furthermore, the LLM-based concolic execution in HyLLfuzz runs 4-19 times faster than the concolic execution in existing hybrid fuzzing tools. This experience shows that LLMs can be effectively inserted into the iterative loop of hybrid fuzzers to efficiently expose more program behaviors.
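The loop described in the abstract can be illustrated with a toy sketch. This is not the HyLLfuzz implementation: the target program, the branch names, the stall threshold, and every function below are hypothetical, and `stub_solver` is a hard-coded stand-in for the step where a slice of the execution trace would be given to an LLM. It only shows the control flow: random mutation runs until coverage stalls, then a solver is asked for an input that reaches each uncovered branch.

```python
import random

MAGIC = b"FUZZ"  # hypothetical guard value inside the toy target


def target(data: bytes) -> set:
    """Toy program under test; returns the ids of the branches it covered."""
    covered = {"entry"}
    if len(data) >= 4:
        covered.add("len_ok")
        if data[:4] == MAGIC:  # roadblock: unlikely to be hit by random flips
            covered.add("magic_ok")
    return covered


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Greybox-style mutation: overwrite one byte, sometimes append one."""
    buf = bytearray(data or b"\x00")
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    if rng.random() < 0.3:
        buf.append(rng.randrange(256))
    return bytes(buf)


def stub_solver(seed: bytes, branch: str) -> bytes:
    """Assumption: stands in for the LLM call. A real system would send a
    trace slice for `branch` to a model and parse the suggested input."""
    if branch == "len_ok":
        return seed.ljust(4, b"\x00")  # pad the seed to four bytes
    if branch == "magic_ok":
        return MAGIC + seed[4:]        # overwrite the four-byte prefix
    return seed


def hybrid_fuzz(seed, all_branches, budget=200, stall_limit=50, rng_seed=0):
    """Alternate random mutation with solver calls when coverage stalls."""
    rng = random.Random(rng_seed)
    corpus = [seed]
    covered = set(target(seed))
    stall = 0
    for _ in range(budget):
        candidate = mutate(rng.choice(corpus), rng)
        new = target(candidate) - covered
        if new:
            corpus.append(candidate)
            covered |= new
            stall = 0
        else:
            stall += 1
        if stall >= stall_limit:
            # Roadblock: ask the solver about each branch still uncovered.
            for branch in sorted(all_branches - covered):
                candidate = stub_solver(corpus[-1], branch)
                new = target(candidate) - covered
                if new:
                    corpus.append(candidate)
                    covered |= new
            stall = 0
    return covered


if __name__ == "__main__":
    branches = {"entry", "len_ok", "magic_ok"}
    print(sorted(hybrid_fuzz(b"A", branches)))
```

In this toy setup the one-byte seed never gains coverage through mutation alone, so the solver step is what unlocks the length check and the magic-value check. Replacing `stub_solver` with a real model call would also require validating the returned input, since a model may suggest an input that does not reach the intended branch.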

