Contextualizing Sink Knowledge for Java Vulnerability Discovery

Java applications are prone to vulnerabilities stemming from the insecure use of security-sensitive APIs, such as file operations that enable path traversal or deserialization routines that allow remote code execution. These sink APIs encode critical information for vulnerability discovery: the program-specific constraints required to reach them and the exploitation conditions necessary to trigger security flaws. Existing fuzzers nevertheless largely overlook such vulnerability-specific knowledge, which limits their effectiveness. We present GONDAR, a sink-centric fuzzing framework that systematically leverages sink API semantics for targeted vulnerability discovery. GONDAR first identifies reachable and exploitable sink call sites through CWE-specific scanning combined with LLM-assisted static filtering. It then deploys two specialized agents that work collaboratively with a coverage-guided fuzzer: an exploration agent generates inputs that reach target call sites by iteratively solving path constraints, while an exploitation agent synthesizes proof-of-concept exploits by reasoning about and satisfying vulnerability-triggering conditions. The agents and the fuzzer continuously exchange seeds and runtime feedback, so that each complements the others. We evaluated GONDAR on real-world Java benchmarks, where it discovers four times more vulnerabilities than Jazzer, the state-of-the-art Java fuzzer. Notably, an earlier version of GONDAR contributed to Team Atlanta's first-place cyber reasoning system (CRS) in the DARPA AI Cyber Challenge. It is also integrated into OSS-CRS, a sandbox project in the Linux Foundation's OpenSSF, where it analyzes open-source Java projects and has already uncovered a zero-day vulnerability.
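To make the distinction between reaching a sink and exploiting it concrete, the following is a minimal sketch and not GONDAR's implementation. The class SinkSketch, the "GET " request prefix, and the base directory /srv/data are all hypothetical, chosen only for illustration. The prefix check stands in for a program-specific constraint that an input must satisfy to arrive at a file-API call site, and the directory-escape check stands in for the exploitation condition of a path traversal.

```java
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical example: separates "reach the sink" from "trigger the flaw".
class SinkSketch {
    // Assumed base directory that file access is meant to stay inside.
    static final Path BASE = Paths.get("/srv/data").normalize();

    // Returns the path that would be passed to a file API (the sink),
    // or null when the input never gets that far.
    static Path resolveForSink(String request) {
        // Reaching constraint (assumed): the request must start with "GET ".
        if (request == null || !request.startsWith("GET ")) {
            return null;
        }
        String name = request.substring(4);
        try {
            // No sanitization on purpose: this is the insecure pattern.
            return BASE.resolve(name).normalize();
        } catch (InvalidPathException e) {
            return null; // malformed names never reach the sink
        }
    }

    // What an exploration step needs: an input that arrives at the call site.
    static boolean reachesSink(String request) {
        return resolveForSink(request) != null;
    }

    // What an exploitation step needs: the resolved path escapes BASE.
    static boolean triggersTraversal(String request) {
        Path p = resolveForSink(request);
        return p != null && !p.startsWith(BASE);
    }

    public static void main(String[] args) {
        String[] inputs = {"PUT a.txt", "GET a.txt", "GET ../../etc/passwd"};
        for (String in : inputs) {
            System.out.println(in + " reaches=" + reachesSink(in)
                    + " traversal=" + triggersTraversal(in));
        }
    }
}
```

In this sketch an input such as "GET a.txt" reaches the sink without triggering anything, which is the kind of seed that coverage feedback alone can find. Only an input that also satisfies the escape condition demonstrates the vulnerability, which is why the two goals are treated as separate problems.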
