Security Digital Twins (SDTs) provide continuously updated virtual replicas of infrastructure for threat simulation, yet they rely on theoretical CVSS scores to assign lateral-movement probabilities. This creates the Contextual Reality Gap: risk is overestimated where unacknowledged mitigations neutralize exploits, and drastically underestimated where logic flaws bypass all memory-safety defenses. We present the Host Active Verification Engine (HAVE), an SDT extension that deploys a safety-constrained host agent to measure the empirical probability of compromise $\hat{p}$ via maximum-likelihood estimation over snapshot-isolated Bernoulli trials. A confidence weight $\alpha_w$, derived from the width of the Wilson interval, propagates $\hat{p}$ into Monte Carlo simulations through a Bayesian blending rule formally related to the Beta-Binomial posterior. Evaluation across four vulnerability classes, three security tiers, and two production binaries shows that HAVE reduces $P_{\text{reach}}$ by 38.2% in false-positive scenarios and increases it by 132.4% in false-negative scenarios, with a net +124.1% correction; post-HAVE estimates vary by only $1.12\times$ across calibration exponents $\kappa$, versus $4.6\times$ for CVSS-only baselines.
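To make the estimation pipeline concrete, the following is a minimal Python sketch of the three ingredients named above: the maximum-likelihood estimate $\hat{p}$ from Bernoulli trials, the Wilson score interval, and a blend of $\hat{p}$ with a CVSS-derived prior. The abstract does not give the exact form of $\alpha_w$ or of the blending rule, so the choices here (weight equal to one minus the interval width, and a linear convex combination) are illustrative assumptions, as are all function names; the calibration exponent $\kappa$ is not modeled.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 is roughly 95%)."""
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need trials > 0 and 0 <= successes <= trials")
    p_hat = successes / trials  # MLE of the Bernoulli success probability
    denom = 1.0 + z * z / trials
    center = (p_hat + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / trials + z * z / (4 * trials * trials)) / denom
    return max(0.0, center - half), min(1.0, center + half)


def confidence_weight(successes: int, trials: int, z: float = 1.96) -> float:
    """ASSUMPTION: weight = 1 - Wilson interval width, clamped to [0, 1].

    A narrow interval (many trials) gives a weight near 1; the paper's
    actual definition of alpha_w may differ.
    """
    lo, hi = wilson_interval(successes, trials, z)
    return min(1.0, max(0.0, 1.0 - (hi - lo)))


def blended_probability(p_cvss: float, successes: int, trials: int, z: float = 1.96) -> float:
    """ASSUMPTION: linear blend of the empirical estimate and the CVSS prior."""
    p_hat = successes / trials
    w = confidence_weight(successes, trials, z)
    return w * p_hat + (1.0 - w) * p_cvss


if __name__ == "__main__":
    # Hypothetical false-positive case: CVSS suggests 0.9, but 0 of 50 exploit trials succeed.
    print(blended_probability(p_cvss=0.9, successes=0, trials=50))
```

Under these assumptions, a few trials leave the estimate close to the CVSS prior, while many trials pull it toward the measured $\hat{p}$, which is the qualitative behavior one would expect from a Beta-Binomial posterior mean.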