Runtime call graphs in modern microservice systems change structure continuously because of workload fluctuations, fault responses, and deployment activities. Despite this complexity, our analysis of more than 500,000 production traces from ByteDance reveals a latent regularity: execution paths concentrate around a small set of recurring invocation patterns. Existing resource management approaches fail to exploit this structure. Industrial autoscalers such as Kubernetes HPA ignore inter-service dependencies, while recent academic methods often assume static topologies, which makes them ineffective under dynamic execution contexts. In this work, we propose Morphis, a dependency-aware provisioning framework that unifies pattern-aware trace analysis with global optimization. Morphis introduces structural fingerprinting, which decomposes each trace into a stable execution backbone and interpretable deviation subgraphs. It then formulates resource allocation as a constrained optimization problem over predicted pattern distributions, minimizing aggregate CPU usage subject to end-to-end tail-latency SLOs. Our evaluations on the TrainTicket benchmark show that Morphis reduces CPU consumption by 35-38% compared with state-of-the-art baselines while maintaining 98.8% SLO compliance.
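To make the optimization step concrete, one possible way to write such a problem is sketched here. This is an illustrative formulation under stated assumptions, not the formulation used by Morphis: the symbols $S$ (set of services), $c_s$ (CPU allocated to service $s$), $\hat{\pi}_k$ (predicted probability of invocation pattern $k$ out of $K$), $L_k(c)$ (end-to-end latency of a request following pattern $k$ under allocation $c$), $T$ (latency SLO), $\delta$ (tolerated violation rate), and $c_s^{\min}$ (per-service floor) are all hypothetical names introduced for this sketch.

```latex
% Illustrative sketch only; every symbol below is an assumption,
% not notation taken from the paper.
\begin{aligned}
\min_{c} \quad & \sum_{s \in S} c_s
  && \text{(aggregate CPU usage)} \\
\text{s.t.} \quad
  & \sum_{k=1}^{K} \hat{\pi}_k \, \Pr\bigl[ L_k(c) \le T \bigr] \;\ge\; 1 - \delta
  && \text{(tail-latency SLO over predicted patterns)} \\
  & c_s \ge c_s^{\min} \quad \forall s \in S
  && \text{(per-service minimum allocation)}
\end{aligned}
```

Under this reading, setting $\delta = 0.01$ would correspond to a 99th-percentile latency target, and weighting by $\hat{\pi}_k$ lets the allocation favor the services that sit on frequently predicted patterns.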