Graph processing systems are essential for analyzing large-scale data with complex relationships, yet most existing frameworks rely on statically provisioned clusters, resulting in poor elasticity and inefficient resource utilization under dynamic workloads. Serverless computing offers automatic scaling and fine-grained billing, but existing serverless graph systems suffer from performance limitations due to inefficient state management and high communication overhead through external storage. We present GraphFlash, a fast and elastic graph processing framework built on serverless infrastructure. GraphFlash adopts a subgraph-centric programming model and leverages shared external storage for coordination and communication, enabling stateless, fine-grained function execution. It supports two execution modes: rotating mode for resource-constrained environments and pinned mode for higher performance when resources are sufficient. To address serverless limitations, GraphFlash introduces system-level optimizations, including partition-aware key aggregation, intra-function partition co-location, and superstep-aware activation. Across multiple graph algorithms and datasets, GraphFlash outperforms existing serverless-compatible systems by up to 127x in execution time and reduces resource consumption by up to 98% under higher-resource configurations, while matching the performance of traditional distributed frameworks on large workloads. Even with limited resources, it achieves up to 48x speedup and 99.97% cost reduction over prior serverless solutions, demonstrating that GraphFlash makes serverless graph processing practical and performant.
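To make the subgraph-centric, storage-mediated execution style described above concrete, the following is a minimal single-process sketch, not GraphFlash's actual implementation. It computes connected components by minimum-label propagation. Everything named here is an assumption made for illustration: the in-memory `store` dict stands in for shared external storage, `run_partition` stands in for one stateless function invocation, partitions are assigned by vertex id modulo the partition count, and messages are batched under one key per (source partition, destination partition) pair, which is only one possible reading of key aggregation.

```python
from collections import defaultdict

# Hypothetical stand-in for shared external storage (a plain dict here).
store = {}

def partition_graph(edges, num_parts):
    """Assign each vertex to a partition by vertex id modulo num_parts."""
    parts = [defaultdict(set) for _ in range(num_parts)]
    for u, v in edges:
        parts[u % num_parts][u].add(v)
        parts[v % num_parts][v].add(u)
    return parts

def run_partition(pid, superstep, num_parts):
    """One stateless invocation: load state, compute locally, write messages.

    Returns True if this partition sent any message in this superstep.
    """
    adj = store[("adj", pid)]
    labels = store[("labels", pid)]

    # In superstep 0 every vertex counts as changed; afterwards only
    # vertices improved by an incoming message do.
    changed = set(labels) if superstep == 0 else set()
    for src in range(num_parts):
        inbox = store.pop(("msg", superstep - 1, src, pid), {})
        for v, label in inbox.items():
            if label < labels[v]:
                labels[v] = label
                changed.add(v)

    # Subgraph-centric step: propagate inside the partition until no
    # local label can be lowered any further.
    frontier = list(changed)
    while frontier:
        u = frontier.pop()
        for w in adj[u]:
            if w in labels and labels[u] < labels[w]:
                labels[w] = labels[u]
                changed.add(w)
                frontier.append(w)

    # Batch outgoing messages: one storage key per destination partition,
    # keeping only the smallest label per remote vertex.
    outbox = defaultdict(dict)
    for u in changed:
        for w in adj[u]:
            dst = w % num_parts
            if dst != pid:
                best = outbox[dst].get(w)
                if best is None or labels[u] < best:
                    outbox[dst][w] = labels[u]
    for dst, msgs in outbox.items():
        store[("msg", superstep, pid, dst)] = msgs

    store[("labels", pid)] = labels
    return bool(outbox)

def connected_components(edges, num_parts=2):
    """Run supersteps until no partition sends a message."""
    store.clear()
    for pid, adj in enumerate(partition_graph(edges, num_parts)):
        store[("adj", pid)] = dict(adj)
        store[("labels", pid)] = {v: v for v in adj}

    superstep = 0
    while True:
        # A sequential loop stands in for concurrent function invocations.
        sent = [run_partition(pid, superstep, num_parts)
                for pid in range(num_parts)]
        if not any(sent):
            break
        superstep += 1

    result = {}
    for pid in range(num_parts):
        result.update(store[("labels", pid)])
    return result
```

Calling `connected_components([(0, 1), (1, 2), (3, 4)], num_parts=2)` labels each vertex with the smallest vertex id in its component. Because each invocation reads and writes all of its state through `store`, the same function body could in principle run on any worker, which is the property that stateless, fine-grained function execution relies on.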