Microservice applications are built from loosely coupled components and leverage cloud elasticity to reduce costs and increase development speed. However, they exhibit complex interactions among dynamically evolving services and highly variable workloads, which poses significant challenges for auto-scaling mechanisms. Key issues include service dependency management, performance profiling, anomaly detection, workload characterization, and fine-grained resource allocation. To address these challenges, recent auto-scaling approaches leverage historical and runtime data to adapt resource provisioning and optimize system efficiency. Since 2018, when Kubernetes became the first Cloud Native Computing Foundation (CNCF) project to graduate, microservice applications have been widely deployed on standardized orchestration platforms, fundamentally shifting auto-scaling from coarse-grained strategies to service-level, dependency-aware ones. Accordingly, this paper surveys state-of-the-art auto-scaling approaches for microservice applications since 2018 and presents a taxonomy along five dimensions: infrastructure, architecture, scaling methods, optimization objectives, and behavior modeling. These dimensions collectively target key objectives, including resource efficiency, cost efficiency, and Service Level Agreement (SLA) assurance, with the aim of balancing system optimization against SLA compliance. We further present a comprehensive comparison and in-depth analysis of representative approaches, examining their core features, strengths, limitations, and applicable scenarios, as well as their performance across diverse environments and workload conditions.