AI Code Sandboxes: A Comparative Security Study. Part 1 of 2 -- Engine-Level Properties (Attack Surface, Leakage, Stackability, CVE History, Patch Cadence, Fuzzing)

George Andronchik, Pavel Lokhmakov

arXiv preprint; 61 pages, 7 figures, 33 tables; Part 1 of 2; companion code repository (Apache-2.0): https://github.com/orbitalab/RnD-ai-sandboxes-sec-study-part-1

This paper reads six engine-level measurements together -- 1.1 host attack surface, 1.2 information leakage, 1.3 defense-in-depth stackability, 1.4 public CVE history, 1.5 patch cadence, and 1.6 upstream fuzzing posture -- to describe how five AI-sandbox products isolate guest code from the host kernel. No single axis is a sufficient basis for a comparative judgement; the cross-axis reading is the load-bearing analysis. Three high-level findings: (1) engine classes (microVM, userspace kernel, OCI container) separate cleanly on every architectural axis, but products within a class do not; (2) product pin policy is the dominant operator-facing variable -- engine-side patch latency aggregates to ~0 days for coordinated disclosures, while downstream lag spans 0 days to 471+ days to "opaque" to infinity; (3) fuzzing investment splits into three tiers, and the strongest combination -- microVM x continuous public fuzzer -- is unoccupied in this set, leaving the "0 published CVEs x no upstream fuzzer x no academic study" intersection structurally unmeasured. We report per-axis orderings, per-product portraits, and a threat-model qualification matrix; no overall ranking is proposed. Companion repository (code, Apache-2.0): https://github.com/orbitalab/RnD-ai-sandboxes-sec-study-part-1. License: CC BY 4.0.
