In this paper, we present Flock, a cloud-native streaming query engine that leverages the on-demand elasticity of Function-as-a-Service (FaaS) platforms to perform real-time data analytics. Traditional server-centric deployments often suffer from resource under- or over-provisioning, leading to performance degradation or resource wastage, respectively. Flock addresses these issues with finer-grained elasticity: it scales continuously to match resources to demand on a per-query basis, and it is billed at millisecond granularity, making it a low-cost solution for stream processing. Our approach, payload invocation, eliminates the need for both external storage services and a query coordinator in the data architecture. Our evaluation shows that Flock significantly outperforms state-of-the-art systems in terms of cost, especially on ARM processors, making it a promising solution for real-time data analytics on FaaS platforms.