Modern cybersecurity requires systematic ways to evaluate how detection systems respond to evolving and previously unseen attack behaviors. Existing malware repositories largely capture known patterns and provide limited support for stress-testing defenses against novel threats. To address this, we present MalGEN, a modular testbed that models adversarial workflows and generates executable artifacts in a controlled environment. The framework decomposes high-level attack objectives into structured stages, enabling the synthesis of diverse, multi-stage behaviors. We evaluate MalGEN across 1,920 benchmark settings covering multiple platforms and behavioral objectives, yielding 977 executable samples. Analysis shows that the generated artifacts exhibit a wide range of malicious techniques and multi-stage attack patterns. However, 45.71% of these samples remain undetected by existing detection engines, revealing notable gaps in current defenses. These findings provide practical insights into the limitations of widely used detection approaches and support the development of more robust security evaluation and testing practices.