Optimizing simultaneous autoscaling for serverless cloud computing

This paper studies resource allocation in serverless cloud computing platforms and proposes an optimization-based approach to autoscaling. Serverless computing relieves users of resource management tasks so that they can focus on application functions. The platform, however, must still allocate resources and replicate functions dynamically as the load changes. Autoscalers in these platforms typically use threshold-based mechanisms that adjust the replicas of each function independently. We model an application as a graph of interconnected functions: requests traverse the graph probabilistically, triggering execution of the functions they visit. Our objective is a control policy that optimally allocates server resources so as to minimize failed requests and response time as the load changes. Using a fluid approximation model and Separated Continuous Linear Programming (SCLP), we derive an optimal control policy that determines, over time, the amount of resources per replica and the number of replicas required. We evaluate the approach in a simulation framework built with Python and SimPy. Compared with threshold-based autoscaling, our approach significantly improves average response times and the number of failed requests, with improvements ranging from 15% to over 300% in most cases. We also examine how system and workload parameters affect performance, which gives insight into the behavior of the optimization approach under different conditions. Overall, the study contributes to resource allocation strategies that improve the efficiency and reliability of serverless cloud computing platforms.
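
To make the modelling assumptions concrete, the following is a minimal Python sketch of two ideas from the abstract: a request traversing a function graph probabilistically, and a threshold-style rule that resizes one function's replicas independently of the others. It is not the simulation framework or the SCLP policy described here. The graph, the function names, the routing probabilities, the 0.7 utilization target, and the replica cap are all hypothetical values chosen for illustration, and the scaling rule is a generic proportional rule rather than the baseline used in the evaluation.

```python
import math
import random

# Hypothetical application graph: each function maps to (successor, probability)
# pairs; any remaining probability mass means the request leaves the application.
ROUTING = {
    "frontend": [("auth", 0.6), ("catalog", 0.4)],
    "auth": [("catalog", 0.5)],
    "catalog": [],
}


def sample_path(entry, rng):
    """Follow one request through the graph and return the functions it triggers."""
    path = [entry]
    current = entry
    while True:
        draw = rng.random()
        cumulative = 0.0
        successor = None
        for name, prob in ROUTING[current]:
            cumulative += prob
            if draw < cumulative:
                successor = name
                break
        if successor is None:
            # No successor was drawn, so the request is complete.
            return path
        path.append(successor)
        current = successor


def threshold_replicas(replicas, utilization, target=0.7, max_replicas=10):
    """Resize one function's replica count from its own utilization only."""
    desired = math.ceil(replicas * utilization / target)
    return max(1, min(max_replicas, desired))


if __name__ == "__main__":
    rng = random.Random(42)
    print(sample_path("frontend", rng))
    print(threshold_replicas(4, 0.9))
```

Because `threshold_replicas` looks at a single function in isolation, a load surge at `frontend` only reaches the scaling decision for `catalog` after the extra requests have already arrived there. That coupling between functions is what a policy computed jointly over the whole graph is meant to exploit.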
