Taming Serverless Cold Starts Through OS Co-Design

Serverless computing promises fine-grained elasticity and operational simplicity, fueling widespread interest from both industry and academia. Yet this promise is undercut by the cold start problem, where invoking a function after a period of inactivity triggers costly initialization before any work can begin. Even with today's high-speed storage, the prevailing view is that achieving sub-millisecond cold starts requires keeping state resident in memory. This paper challenges that assumption. Our analysis of existing snapshot/restore mechanisms shows that OS-level limitations, not storage speed, are the real barrier to ultra-fast restores from disk. These limitations force current systems either to restore state piecemeal at high cost or to capture too much state, leading to longer restore times and unpredictable performance. Furthermore, the memory primitives currently exposed by the OS make it difficult to reliably fetch data into memory and avoid costly runtime page faults. To overcome these barriers, we present Spice, an execution engine purpose-built for serverless snapshot/restore. Spice integrates directly with the OS to restore kernel state without costly replay and introduces dedicated primitives for restoring memory mappings efficiently and reliably. As a result, Spice delivers near-warm performance on cold restores from disk, reducing latency by up to 14.9x over state-of-the-art process-based systems and 10.6x over VM-based systems. This shows that high performance and memory elasticity no longer need to be a trade-off in serverless computing.
