Autoscaling is critical for maintaining performance and resource utilization in cloud applications with dynamic workloads. However, traditional autoscaling techniques are often ill-suited to microservice-based applications because of their diverse workload patterns and the complex interactions between microservices. Specifically, performance anomalies propagate through these interactions and leave many microservices in an abnormal state, which makes it difficult to identify the root performance bottlenecks (PBs) and to formulate appropriate scaling strategies. In addition, mainstream approaches that balance resource consumption against performance rely on online optimization algorithms, which require multiple iterations; this causes oscillation and raises the likelihood of performance degradation. To tackle these issues, we propose PBScaler, a bottleneck-aware autoscaling framework designed to prevent performance degradation in microservice-based applications. The key insight of PBScaler is to locate the PBs before scaling. To that end, we propose TopoRank, a novel random walk algorithm based on topological potential that identifies them and thereby reduces unnecessary scaling. By integrating TopoRank with an offline performance-aware optimization algorithm, PBScaler optimizes replica management without disrupting the online application. Comprehensive experiments demonstrate that PBScaler outperforms existing state-of-the-art approaches in mitigating performance issues while conserving resources.
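To illustrate why a random walk can separate a root bottleneck from services that are merely affected by it, the following is a minimal sketch of a generic personalized-PageRank-style walk over a call graph. It is not TopoRank: the abstract does not define the topological potential or the transition rule, so the anomaly scores, the transition weights, and every name here (`walk_scores`, `calls`, `anomaly`) are assumptions made for illustration only.

```python
from typing import Dict, List


def walk_scores(
    calls: Dict[str, List[str]],
    anomaly: Dict[str, float],
    damping: float = 0.85,
    iters: int = 200,
) -> Dict[str, float]:
    """Score services with a restart-biased random walk (illustrative only).

    calls   : hypothetical call graph, caller -> list of callees
    anomaly : hypothetical per-service anomaly score (>= 0)
    """
    nodes = sorted(
        set(calls) | {c for cs in calls.values() for c in cs} | set(anomaly)
    )
    total = sum(anomaly.get(n, 0.0) for n in nodes)
    # Restart distribution: proportional to anomaly, uniform if nothing is abnormal.
    if total > 0:
        restart = {n: anomaly.get(n, 0.0) / total for n in nodes}
    else:
        restart = {n: 1.0 / len(nodes) for n in nodes}

    eps = 1e-9  # keeps transition weights positive when a callee has zero anomaly
    score = dict(restart)
    for _ in range(iters):
        nxt = {n: (1.0 - damping) * restart[n] for n in nodes}
        for n in nodes:
            callees = calls.get(n, [])
            if callees:
                # Walk from caller to callee, preferring more anomalous callees,
                # since anomalies tend to propagate upstream from callee to caller.
                weights = [anomaly.get(c, 0.0) + eps for c in callees]
                wsum = sum(weights)
                for c, w in zip(callees, weights):
                    nxt[c] += damping * score[n] * w / wsum
            else:
                # A service with no callees keeps its mass (self-loop), so the
                # walk accumulates at the deepest anomalous service.
                nxt[n] += damping * score[n]
        score = nxt
    return score


# Hypothetical example: a slow database makes its callers look abnormal too.
calls = {"frontend": ["cart", "catalog"], "cart": ["db"]}
anomaly = {"frontend": 0.6, "cart": 0.7, "db": 0.9, "catalog": 0.05}

scores = walk_scores(calls, anomaly)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this toy graph three services look abnormal, but the walk concentrates its probability mass on the service that the others depend on, so only that service would be a candidate for scaling.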