Context: Exhaustive fuzzing of modern JavaScript engines is infeasible because of the vast number of program states and execution paths. Coverage-guided fuzzers waste effort on low-risk inputs and often discard vulnerability-triggering inputs that do not increase coverage. Existing heuristics proposed to mitigate this require expert effort, are brittle, and are hard to adapt. Objective: We propose a data-centric, LLM-boosted alternative that learns from historical vulnerabilities to automatically identify a minimal set of static (code) and dynamic (runtime) features for detecting high-risk inputs. Method: Guided by historical V8 bugs, iterative prompting generated 115 static and 49 dynamic features; the dynamic features require only five trace flags, which minimizes instrumentation cost. After feature selection, 41 features remained and were used to train an XGBoost model that predicts high-risk inputs during fuzzing. Results: Combining static and dynamic features yields over 85% precision and under 1% false alarms. Only 25% of these features are needed for comparable performance, which suggests that most of the search space is irrelevant. Conclusion: This work introduces feature-guided fuzzing, an automated, data-driven approach that replaces coverage guidance with data-directed inference, steering fuzzers toward high-risk states for faster, more targeted, and reproducible vulnerability discovery. To support open science, all scripts and data are available at https://github.com/KKGanguly/DataCentricFuzzJS.
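To make the two reported metrics concrete, the following minimal sketch computes precision and false-alarm rate from binary predictions. It assumes that "false alarms" means the false-positive rate, FP / (FP + TN), which the abstract does not state explicitly. The helper names and the toy labels are hypothetical and are not taken from the released scripts or data.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Counts for a binary high-risk (1) vs. low-risk (0) classifier."""
    tp: int = 0
    fp: int = 0
    tn: int = 0
    fn: int = 0


def confusion(y_true, y_pred):
    """Tally the confusion matrix from parallel label sequences."""
    c = Confusion()
    for truth, pred in zip(y_true, y_pred):
        if pred and truth:
            c.tp += 1
        elif pred and not truth:
            c.fp += 1
        elif truth:
            c.fn += 1
        else:
            c.tn += 1
    return c


def precision(c):
    """Fraction of inputs flagged as high-risk that really are high-risk."""
    flagged = c.tp + c.fp
    return c.tp / flagged if flagged else 0.0


def false_alarm_rate(c):
    """Assumed definition: fraction of low-risk inputs wrongly flagged."""
    benign = c.fp + c.tn
    return c.fp / benign if benign else 0.0


# Toy data (illustrative only): 6 high-risk inputs, 200 low-risk inputs.
y_true = [1] * 6 + [0] * 200
y_pred = [1, 1, 1, 1, 1, 0] + [1] + [0] * 199

counts = confusion(y_true, y_pred)
print(f"precision={precision(counts):.3f}")
print(f"false_alarm_rate={false_alarm_rate(counts):.3f}")
```

Under this definition, a fuzzer that prioritizes flagged inputs spends little time on low-risk inputs when the false-alarm rate is low, while high precision means most prioritized inputs are worth the extra execution budget.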