The runtime call graphs of modern microservice systems change structure continuously in response to workload fluctuations, fault handling, and deployment activities. Despite this complexity, our analysis of over 500,000 production traces from ByteDance reveals a latent regularity: execution paths concentrate around a small set of recurring invocation patterns. Existing resource management approaches fail to exploit this structure. Industrial autoscalers such as Kubernetes HPA ignore inter-service dependencies, while recent academic methods often assume static topologies, which renders them ineffective under dynamic execution contexts. In this work, we propose Morphis, a dependency-aware provisioning framework that unifies pattern-aware trace analysis with global optimization. Morphis introduces structural fingerprinting, which decomposes each trace into a stable execution backbone and interpretable deviation subgraphs. Resource allocation is then formulated as a constrained optimization problem over predicted pattern distributions, minimizing aggregate CPU usage subject to end-to-end tail-latency SLOs. Our evaluation on the TrainTicket benchmark shows that Morphis reduces CPU consumption by 35-38% compared to state-of-the-art baselines while maintaining 98.8% SLO compliance.
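To make the backbone/deviation idea concrete, the following is a minimal sketch under our own simplifying assumptions; it is not Morphis's actual algorithm, which the abstract does not specify. We assume a trace can be reduced to a set of (caller, callee) edges, that the backbone is the set of edges appearing in at least a `min_support` fraction of traces, and that a fingerprint is the pair of edge sets a trace is missing from, or adds to, that backbone. The function names, the `min_support` parameter, and the service names are all hypothetical.

```python
from collections import Counter
from typing import Dict, FrozenSet, List, Tuple

Edge = Tuple[str, str]                      # (caller, callee); assumed trace representation
Trace = FrozenSet[Edge]
Fingerprint = Tuple[Tuple[Edge, ...], Tuple[Edge, ...]]  # (missing edges, extra edges)


def extract_backbone(traces: List[Trace], min_support: float = 0.9) -> FrozenSet[Edge]:
    """Return the edges that occur in at least `min_support` of all traces.

    The support threshold is an illustrative assumption, not a value from the paper.
    """
    if not traces:
        return frozenset()
    counts = Counter(edge for trace in traces for edge in trace)
    threshold = min_support * len(traces)
    return frozenset(edge for edge, n in counts.items() if n >= threshold)


def fingerprint(trace: Trace, backbone: FrozenSet[Edge]) -> Fingerprint:
    """Describe a trace by how it deviates from the backbone."""
    missing = tuple(sorted(backbone - trace))   # backbone edges this trace skipped
    extra = tuple(sorted(trace - backbone))     # edges outside the backbone
    return (missing, extra)


def pattern_distribution(traces: List[Trace],
                         backbone: FrozenSet[Edge]) -> Dict[Fingerprint, float]:
    """Empirical frequency of each deviation fingerprint."""
    if not traces:
        return {}
    counts = Counter(fingerprint(t, backbone) for t in traces)
    return {fp: n / len(traces) for fp, n in counts.items()}


# Hypothetical traces: three follow the common path, one adds a retry call.
common = frozenset({("gateway", "order"), ("order", "payment")})
with_retry = common | {("order", "retry")}
traces = [common, common, common, with_retry]

backbone = extract_backbone(traces, min_support=0.75)
dist = pattern_distribution(traces, backbone)
```

In this toy input the backbone consists of the two edges shared by every trace, and the distribution has two entries: the empty deviation and the single extra `("order", "retry")` edge. A distribution of this kind is the sort of input that an optimizer over predicted pattern frequencies could consume, although the exact form Morphis uses may differ.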