Adaptive and AI-Augmented Security Testing: A Systematic Survey of Program Analysis, Feedback-Driven Testing, and Hybrid Learning-Based Approaches

Modern software systems are increasingly developed within rapid continuous integration and deployment (CI/CD) pipelines, where ensuring security prior to release presents significant technical and organizational challenges. Traditional static and dynamic analysis tools provide valuable structural and behavioral insights, yet they often operate in non-adaptive workflows and produce large volumes of warnings that require manual triage. Feedback-driven fuzzing and search-based testing approaches have demonstrated the power of iterative input refinement guided by execution signals, while large language models (LLMs) have shown promise in automated test generation but frequently lack semantic grounding in program structure. This paper presents a systematic survey of adaptive and AI-augmented security testing research across five domains: (1) structural program analysis for vulnerability detection, (2) DevSecOps and continuous security testing, (3) feedback-driven fuzzing and search-based testing, (4) LLM-based automated test generation, and (5) emerging hybrid systems integrating program analysis with adaptive learning. We analyze fifty-five peer-reviewed studies selected from the 22,088 raw records returned by a systematic search of four major databases. Our analysis reveals a persistent disconnect between structural program representations (ASTs, CFGs, and CPGs) and adaptive testing mechanisms. We characterize this as structural-adaptive fragmentation: a systematic separation that neither paradigm individually addresses. Among the surveyed studies, no system incorporates human triage signals as feedback for refining structural models. We conclude by identifying five open research challenges and outlining a unified agenda for semantically grounded, feedback-driven, polyglot security testing frameworks.
