AutoSUT: The Environment Semantics Gap in Structured CTI for Adversary Emulation

Structured Cyber Threat Intelligence (CTI) increasingly supports adversary emulation, detection evaluation, and cyber range design, yet each workflow still requires a target System Under Test (SUT) whose environment is not fully described by public CTI. We define the environment semantics gap as a measurable property of structured CTI: the SUT information required for replay-ready instantiation that cannot be derived solely from structured fields. We present AutoSUT, a pipeline that locates where corpus-supported narrowing ends and analyst specification begins. Across ATT&CK Enterprise, Mobile, and ICS STIX bundles, with CAPEC and FiGHT as contrast datasets, we measure platform coverage, software specificity, vulnerability evidence, and deployment compatibility. Platform tags are near-universal, but 97.6% of Enterprise software objects lack version indicators and CPE identifiers. Campaign-level CVE evidence covers only 9.6% of campaigns, even after free-text enrichment, and only 19 of 691 techniques (2.7%) are container-feasible under conservative backend-family assignment. Profile confusion among intrusion sets drops from 1.3% for one linked software item to 0% for two linked software items, indicating that software-evidence density, not CVE enrichment, drives actor-specific SUT screening. Finally, we constructively demonstrate environment non-uniqueness: holding every corpus-supported element fixed and varying only the analyst-authored region yields multiple distinct, campaign-compatible SUTs, including an executable witness running CVE-2021-41773 and coincident witnesses in which structurally different service realizations execute the same attack. Structured CTI, therefore, constrains but does not uniquely determine the executable environment. Replay-ready emulation should accordingly declare which environment commitments the corpus supports and which remain analyst-authored.
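
As a rough illustration of the platform-coverage and software-specificity measurements described above, the following minimal sketch scans a STIX bundle that has already been loaded as a Python dictionary. It is not the AutoSUT implementation. The function name `specificity_summary`, the choice of `tool_version`, `version`, and `cpe` as version and CPE indicators, and the exclusion of revoked objects are all assumptions made for illustration; the paper's own criteria may differ.

```python
from typing import Any

# ATT&CK represents software as STIX "malware" and "tool" objects.
SOFTWARE_TYPES = {"malware", "tool"}

# Assumption: these property names are treated as version / CPE indicators.
# The criteria actually used by AutoSUT may be broader or narrower.
VERSION_FIELDS = ("tool_version", "version")
CPE_FIELDS = ("cpe",)


def specificity_summary(bundle: dict[str, Any]) -> dict[str, float]:
    """Summarize platform coverage and software specificity for one bundle.

    Returns the number of software objects, the fraction carrying a
    non-empty x_mitre_platforms list, and the fraction that has neither
    a version indicator nor a CPE identifier.
    """
    software = [
        obj
        for obj in bundle.get("objects", [])
        if obj.get("type") in SOFTWARE_TYPES and not obj.get("revoked", False)
    ]
    total = len(software)
    if total == 0:
        return {"total": 0, "platform_tagged": 0.0, "no_version_no_cpe": 0.0}

    # Platform coverage: the object names at least one platform.
    tagged = sum(1 for obj in software if obj.get("x_mitre_platforms"))

    # Specificity gap: no version field and no CPE field is populated.
    bare = sum(
        1
        for obj in software
        if not any(obj.get(field) for field in VERSION_FIELDS + CPE_FIELDS)
    )

    return {
        "total": total,
        "platform_tagged": tagged / total,
        "no_version_no_cpe": bare / total,
    }
```

Applied to a bundle loaded with `json.load`, the `no_version_no_cpe` value is the kind of quantity the 97.6% figure reports, though the exact number depends on which fields are accepted as indicators.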
