Traditional database fuzzing techniques primarily focus on syntactic correctness and general SQL structures, leaving critical yet obscure DBMS features, such as system-level modes (e.g., GTID), programmatic constructs (e.g., PROCEDURE), and advanced process commands (e.g., KILL), largely underexplored. Although rarely triggered by typical inputs, these features can lead to severe crashes or security issues when executed under edge-case conditions. In this paper, we present FuzzySQL, a novel LLM-powered adaptive fuzzing framework designed to uncover subtle vulnerabilities in DBMS special features. FuzzySQL combines grammar-guided SQL generation with logic-shifting progressive mutation, a technique that explores alternative control paths by negating conditions and restructuring execution logic, thereby synthesizing structurally and semantically diverse test cases. To reach deeper execution paths in the back end, FuzzySQL employs a hybrid error repair pipeline that unifies rule-based patching with LLM-driven semantic repair, enabling automatic correction of syntactic and context-sensitive failures. We evaluate FuzzySQL on multiple DBMSs, including MySQL, MariaDB, SQLite, PostgreSQL, and ClickHouse, uncovering 37 vulnerabilities, 7 of which are tied to under-tested DBMS special features. As of this writing, 29 cases have been confirmed, of which 9 have been assigned CVE identifiers and 14 have already been fixed by vendors; additional vulnerabilities are scheduled to be patched in upcoming releases. Our results highlight the limitations of conventional fuzzers in semantic feature coverage and demonstrate the potential of LLM-based fuzzing to discover deeply hidden bugs in complex database systems.
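As a rough illustration of the condition-negation idea behind logic-shifting mutation, the sketch below produces one mutant per `IF ... THEN` condition in a stored-procedure body, each with a single condition wrapped in `NOT (...)`. It is not FuzzySQL's implementation: the function name `single_negation_mutants`, the regular expression, and the example procedure are all assumptions made for this example, and a real mutator would operate on a parsed grammar rather than on raw text.

```python
import re

# Hypothetical pattern for illustration: matches "IF <condition> THEN" on one line.
# "END IF;" is not matched because IF must be followed by whitespace, and
# "ELSEIF" is not matched because of the leading word boundary.
_IF_COND = re.compile(r"\bIF\s+(.+?)\s+THEN\b", re.IGNORECASE)


def single_negation_mutants(sql: str) -> list[str]:
    """Return one mutant per IF condition, with only that condition negated.

    Negating one condition at a time shifts execution onto the branch that
    the original statement would have skipped, while leaving the rest of
    the statement unchanged.
    """
    mutants = []
    for match in _IF_COND.finditer(sql):
        start, end = match.span()
        negated = f"IF NOT ({match.group(1)}) THEN"
        mutants.append(sql[:start] + negated + sql[end:])
    return mutants


# Assumed example input: a procedure that touches a special feature (KILL).
EXAMPLE_PROC = (
    "CREATE PROCEDURE p(x INT) BEGIN "
    "IF x > 0 THEN KILL QUERY 1; END IF; "
    "IF x IS NULL THEN SELECT 1; END IF; END"
)

MUTANTS = single_negation_mutants(EXAMPLE_PROC)
```

Each element of `MUTANTS` differs from `EXAMPLE_PROC` in exactly one condition, so a fuzzer could submit them one by one and attribute any crash to a specific flipped branch.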