Misconfiguration, excessive privilege, and fragmented controls remain major causes of cloud-infrastructure incidents. This paper proposes an open-source framework with four contributions: (i) a cross-platform identity-resource graph for Kubernetes and OpenStack, (ii) a policy-to-evidence data model that links OPA/Gatekeeper and Checkov results to live assets, (iii) an identity-aware correlation algorithm that reduces noisy runtime alerts, and (iv) a guarded remediation workflow that converts validated policy violations into Kubernetes patches or Terraform plans. To make the evaluation reproducible, we specify the workload generation, the injected misconfiguration classes, the number of run repetitions, the metric definitions, and the statistical reporting. In a private-cloud testbed of 50 to 200 nodes, the framework reduced assessment time from 120.4 ± 6.8 min to 18.2 ± 1.7 min, lowered the false-positive rate from 12.1% to 4.7%, and increased checked component coverage from 48% to 92%. The reported 62% reduction in observable events corresponding to injected violations is scoped to the defined 30-day operational test, and the approximately 40% cost reduction is scoped to the one-year, 200-node cost model; neither is claimed as a hyperscale result.