Melding the Serverless Control Plane with the Conventional Cluster Manager for Speed and Resource Efficiency

Serverless platforms face a trade-off. Conventional cluster managers such as Kubernetes offer the compatibility needed to co-locate the Function-as-a-Service (FaaS) and Backend-as-a-Service (BaaS) components of serverless applications, but at the cost of high cold-start latency. Specialized FaaS-only systems such as Dirigent achieve low latency by sacrificing that compatibility, which prevents integrated management and optimization. Our analysis reveals that FaaS traffic is bimodal: predictable, sustainable traffic consumes more than 98% of cluster resources, whereas sporadic, excessive bursts stress the control plane's scaling latency rather than its throughput. Based on these insights, we design PulseNet, a serverless architecture whose dual-track control plane is tailored to both traffic types. PulseNet's standard track manages sustainable traffic with long-lived, full-featured Regular Instances under a conventional cluster manager, preserving compatibility for the majority of the workload. To handle excessive traffic, an expedited track bypasses the slow manager and rapidly creates short-lived, disposable Emergency Instances, minimizing both cold-start latency and the resources wasted on idle instances. On a production workload, this hybrid approach achieves 35% better performance than Dirigent, a FaaS-only system, at the same cost, and it outperforms other Kubernetes-compatible systems by 1.5-3.5x while reducing cost by up to 70%.
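The dual-track idea can be illustrated with a toy routing model. This is only a minimal sketch under stated assumptions, not PulseNet's implementation: the class `DualTrackPlane`, its fields, and the fixed-capacity policy are hypothetical names and simplifications introduced here. It assumes the Regular Instances can absorb a fixed number of concurrent requests and that every request beyond that gets its own disposable Emergency Instance, torn down as soon as the request completes.

```python
from dataclasses import dataclass


@dataclass
class DualTrackPlane:
    """Toy model of a dual-track control plane (hypothetical, for illustration)."""

    regular_capacity: int         # assumed: concurrent requests Regular Instances can absorb
    regular_in_flight: int = 0    # requests currently served on the standard track
    emergency_in_flight: int = 0  # live Emergency Instances (one per excess request)
    emergency_created: int = 0    # total Emergency Instances ever created

    def dispatch(self) -> str:
        """Route one incoming request and return the track that serves it."""
        if self.regular_in_flight < self.regular_capacity:
            # Sustainable traffic: served by a long-lived Regular Instance.
            self.regular_in_flight += 1
            return "regular"
        # Excessive traffic: bypass the cluster manager and create a
        # short-lived Emergency Instance for this request only.
        self.emergency_created += 1
        self.emergency_in_flight += 1
        return "emergency"

    def complete(self, track: str) -> None:
        """Mark one request on the given track as finished."""
        if track == "regular":
            # The Regular Instance stays alive and becomes free again.
            self.regular_in_flight -= 1
        else:
            # The Emergency Instance is disposed of immediately, so it
            # never sits idle holding resources.
            self.emergency_in_flight -= 1


plane = DualTrackPlane(regular_capacity=2)
burst = [plane.dispatch() for _ in range(5)]
print(burst)
```

In this sketch a burst of five requests against a capacity of two sends the first two to the standard track and the remaining three to the expedited track. A real system would also need to decide when sustained excess traffic should trigger new Regular Instances through the cluster manager; that policy is not described in the abstract and is left out here.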

