FaaS platforms rely on cluster managers like Kubernetes for resource management. Kubernetes is popular due to its state-centric APIs, which decouple the control plane into modular controllers. However, when scaling out a burst of FaaS instances, message passing becomes the primary bottleneck, because controllers must exchange extensive state through the API Server. Existing solutions opt for a clean-slate redesign of cluster managers, but this sacrifices compatibility with the existing ecosystem and demands substantial engineering effort. We present KUBEDIRECT, a Kubernetes-based cluster manager for FaaS. We find that FaaS platforms share a common narrow waist that allows us to achieve both efficiency and external compatibility. Our insight is that the sequential structure of the narrow waist obviates the need for a single source of truth, allowing us to bypass the API Server and pass messages directly between controllers for efficiency. However, this approach introduces a set of ephemeral states across controllers, which makes end-to-end semantics hard to enforce in the absence of centralized coordination. KUBEDIRECT employs a novel state management scheme that leverages the narrow waist as a hierarchical write-back cache, ensuring consistency and convergence to the desired state. KUBEDIRECT integrates seamlessly with Kubernetes, adding ~150 LoC per controller. Experiments show that KUBEDIRECT reduces serving latency by 26.7x compared with Knative and performs comparably to Dirigent, the state-of-the-art clean-slate platform.
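To make the idea of direct message passing with write-back more concrete, the following is a minimal sketch and not KUBEDIRECT's actual implementation. It assumes a purely sequential chain of controllers and a toy in-memory stand-in for the API Server. The names `ApiServer`, `Controller`, `handle`, `flush`, and the `stage-a`/`stage-b`/`stage-c` labels are hypothetical and chosen only for illustration.

```python
from typing import Dict, Optional, Set


class ApiServer:
    """Toy stand-in for a central state store (assumption, not the real API Server)."""

    def __init__(self) -> None:
        self.objects: Dict[str, int] = {}
        self.writes = 0  # counts how many writes reach the central store

    def put(self, key: str, value: int) -> None:
        self.objects[key] = value
        self.writes += 1


class Controller:
    """One stage of a sequential controller chain (hypothetical)."""

    def __init__(self, name: str, api: ApiServer,
                 downstream: Optional["Controller"] = None) -> None:
        self.name = name
        self.api = api
        self.downstream = downstream
        self.cache: Dict[str, int] = {}  # ephemeral state held by this stage
        self.dirty: Set[str] = set()     # keys not yet written back

    def handle(self, key: str, desired: int) -> None:
        """Record the desired state locally and forward it directly downstream."""
        self.cache[key] = desired
        self.dirty.add(key)
        if self.downstream is not None:
            self.downstream.handle(key, desired)

    def flush(self) -> None:
        """Write back only the latest value per dirty key, stage by stage."""
        for key in sorted(self.dirty):
            self.api.put(f"{self.name}/{key}", self.cache[key])
        self.dirty.clear()
        if self.downstream is not None:
            self.downstream.flush()


if __name__ == "__main__":
    api = ApiServer()
    tail = Controller("stage-c", api)
    middle = Controller("stage-b", api, tail)
    head = Controller("stage-a", api, middle)

    # A burst of scaling decisions for one (hypothetical) function.
    for replicas in (1, 5, 10):
        head.handle("fn-x", replicas)

    print("writes before flush:", api.writes)
    head.flush()
    print("writes after flush:", api.writes)
```

In this sketch, the three updates travel down the chain without touching the central store, and the later flush writes one value per stage, so intermediate values are coalesced. A real system would also need failure handling and reconciliation of the ephemeral state, which this sketch omits.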