Serverless platforms face a trade-off. Conventional cluster managers like Kubernetes offer compatibility for co-locating the Function-as-a-Service (FaaS) and Backend-as-a-Service (BaaS) components of serverless applications, at the cost of high cold-start latency. Specialized FaaS-only systems like Dirigent achieve low latency by sacrificing that compatibility, which prevents integrated management and optimization. Our analysis reveals that FaaS traffic is bimodal: predictable, sustainable traffic consumes more than 98% of cluster resources, whereas sporadic, excessive bursts stress the control plane's scaling latency, not its throughput. With these insights, we design PulseNet, a serverless architecture that uses a dual-track control plane tailored to both traffic types. PulseNet's standard track manages sustainable traffic with long-lived, full-featured Regular Instances under a conventional cluster manager, preserving compatibility for the majority of the workload. To handle excessive traffic, an expedited track bypasses the slow manager to rapidly create short-lived, disposable Emergency Instances, minimizing both cold-start latency and the resources wasted on idle instances. On a production workload, this hybrid approach achieves 35% better performance than Dirigent, a FaaS-only system, at the same cost, and outperforms other Kubernetes-compatible systems by 1.5-3.5x while reducing cost by up to 70%.
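The dual-track idea can be illustrated with a toy dispatcher. This is only a minimal sketch under stated assumptions, not PulseNet's implementation: the class and method names (`DualTrackDispatcher`, `provision_regular`, `dispatch`, `release`) are hypothetical, the conventional cluster manager is reduced to a method that pre-provisions a pool, and "no idle Regular Instance" is assumed as the trigger for the expedited track.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    kind: str            # "regular" or "emergency"
    busy: bool = False


@dataclass
class DualTrackDispatcher:
    """Toy model of a dual-track control plane (illustrative only)."""
    regular: list = field(default_factory=list)
    emergency_created: int = 0

    def provision_regular(self, count: int) -> None:
        # Standard track: stands in for the conventional cluster manager
        # scaling long-lived Regular Instances for predictable traffic.
        self.regular.extend(Instance("regular") for _ in range(count))

    def dispatch(self) -> Instance:
        # Prefer an idle Regular Instance when one exists.
        for inst in self.regular:
            if not inst.busy:
                inst.busy = True
                return inst
        # Expedited track (assumed trigger): no idle capacity, so create a
        # disposable Emergency Instance instead of waiting on the manager.
        self.emergency_created += 1
        return Instance("emergency", busy=True)

    def release(self, inst: Instance) -> None:
        # Regular Instances return to the pool; Emergency Instances are
        # never added to the pool, so they are dropped after one request.
        if inst.kind == "regular":
            inst.busy = False


if __name__ == "__main__":
    d = DualTrackDispatcher()
    d.provision_regular(2)
    burst = [d.dispatch() for _ in range(3)]
    print([inst.kind for inst in burst])
```

With two Regular Instances provisioned, a burst of three concurrent requests uses both and sends the third to an Emergency Instance. Once the burst is released, the pool serves later requests again and no emergency capacity sits idle.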