This paper reads six engine-level measurements together (1.1 host attack surface, 1.2 information leakage, 1.3 defense-in-depth stackability, 1.4 public CVE history, 1.5 patch cadence, and 1.6 upstream fuzzing posture) to describe how five AI-sandbox products isolate guest code from the host kernel. No single axis is a sufficient basis for a comparative judgement; the cross-axis reading is the load-bearing analysis. We report three high-level findings: (1) engine classes (microVM, userspace kernel, OCI container) separate cleanly on every architectural axis, but products within a class do not; (2) product pin policy is the dominant operator-facing variable: engine-side patch latency aggregates to ~0 days for coordinated disclosures, while downstream lag ranges from 0 days through 471+ days to "opaque" and, at the extreme, infinity; (3) fuzzing investment splits into three tiers, and the strongest combination, microVM × continuous public fuzzer, is unoccupied in this set, leaving the "0 published CVEs × no upstream fuzzer × no academic study" intersection structurally unmeasured. We present per-axis orderings, per-product portraits, and a threat-model qualification matrix; we propose no overall ranking. Companion repository (code, Apache-2.0): https://github.com/orbitalab/RnD-ai-sandboxes-sec-study-part-1. License: CC BY 4.0.