Security teams routinely simulate attacks against their own systems to check whether their monitoring would catch a real intruder. Breach-and-attack-simulation (BAS) tools surface findings from these exercises, but the security information and event management (SIEM) systems that watch production need detection rules, and today a human bridges that gap by hand, reading each finding and writing the corresponding Sigma rule (Sigma is a vendor-neutral detection format). We show that this translation can be partially automated when probes are drawn from a locked corpus, so that each finding carries a stable identifier back to the originating probe. We describe a deterministic synthesis function that maps each finding to a starter Sigma rule through a small template library (N=23, indexed by categories from the OWASP LLM and Web Top 10), with a back-reference to the originating finding and its MITRE ATT&CK technique. On two locked corpora (17-probe LLM, 23-probe Web), every bypassed-probe finding yields a starter rule, and 17/17 emitted rules parse and convert to the Splunk and Elasticsearch backends. Replayed through a live OpenSearch SIEM, the LLM rules fire on 30% of a held-out AdvBench subset and 14% of HarmBench, at a 7.7% false-positive rate on a benign baseline; the Web rules are validated structurally only, not against a held-out attack set. The contribution is a verifiable, byte-stable path from BAS finding to operator-deployable starter rule, re-derivable from the published corpus and template library alone. It trades the breadth of LLM-generative methods for exact reproducibility and a typed traceback from any fired alert to the originating probe.
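To make the idea of a deterministic synthesis function concrete, the following is a minimal sketch and not the authors' implementation. The finding fields (`finding_id`, `probe_id`, `category`, `attack_technique`, `indicators`, `bypassed`), the two-entry `TEMPLATES` table, the log sources, and the `RULE_NAMESPACE` seed are all assumptions made for illustration. The rule is serialized as key-sorted JSON, which is also valid YAML, so that the same finding always produces the same bytes.

```python
import json
import uuid

# Hypothetical namespace seed for stable rule IDs (not from the paper).
RULE_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.org/bas-sigma")

# Hypothetical two-entry template library keyed by OWASP category.
# The library described in the abstract has 23 templates; these are stand-ins.
TEMPLATES = {
    "LLM01": {
        "title": "Possible prompt injection",
        "logsource": {"category": "application", "product": "llm_gateway"},
        "field": "prompt",
    },
    "A03": {
        "title": "Possible injection attempt",
        "logsource": {"category": "webserver"},
        "field": "cs-uri-query",
    },
}


def synthesize(finding):
    """Map one BAS finding to a starter Sigma rule, or None if not applicable."""
    template = TEMPLATES.get(finding["category"])
    if template is None or not finding["bypassed"]:
        return None  # only bypassed probes with a known category yield a rule
    rule = {
        "title": template["title"],
        # uuid5 is a pure function of its inputs, so the ID is stable per probe.
        "id": str(uuid.uuid5(RULE_NAMESPACE, finding["probe_id"])),
        "status": "experimental",
        # Back-reference from the rule to the originating finding and probe.
        "description": "Starter rule for finding {} (probe {})".format(
            finding["finding_id"], finding["probe_id"]
        ),
        "tags": ["attack." + finding["attack_technique"].lower()],
        "logsource": template["logsource"],
        "detection": {
            "selection": {
                template["field"] + "|contains": sorted(finding["indicators"])
            },
            "condition": "selection",
        },
        "level": "medium",
    }
    # Sorted keys and fixed indentation keep the output byte-stable.
    return json.dumps(rule, sort_keys=True, indent=2)


# Illustrative finding; every value here is made up for the example.
example_finding = {
    "finding_id": "F-0001",
    "probe_id": "web-probe-07",
    "category": "A03",
    "attack_technique": "T1190",
    "indicators": ["' OR 1=1", "UNION SELECT"],
    "bypassed": True,
}

if __name__ == "__main__":
    print(synthesize(example_finding))
```

Because the function has no randomness, clock reads, or network calls, re-running it on the same finding and template library reproduces the rule exactly, and the `description` and `id` fields let a fired alert be traced back to its probe.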