Today's distributed systems operate in complex environments that inevitably involve faults and even adversarial behaviors. Predicting their performance under such environments directly from formal designs remains a longstanding challenge. We present the first formal framework that systematically enables performance prediction of distributed systems across diverse faulty scenarios. Our framework features a fault injector together with a wide range of faults, reusable as a library, and model compositions that integrate the system and the fault injector into a unified model suitable for statistical analysis of performance properties such as throughput and latency. We formalize the framework in Maude and implement it as an automated tool, PERF. Applied to representative distributed systems, PERF accurately predicts system performance under varying fault settings, with estimations from formal designs consistent with evaluations on real deployments.
翻译:当今分布式系统运行在复杂环境中,不可避免地涉及故障甚至对抗行为。直接从形式化设计预测此类环境下的系统性能,仍是一个长期存在的挑战。我们提出了首个形式化框架,能够系统性地实现分布式系统在多样化故障场景下的性能预测。该框架包含一个故障注入器及多种可复用为库的故障类型,并通过模型组合将系统与故障注入器集成至统一模型中,适用于吞吐量、延迟等性能属性的统计分析。我们在Maude中形式化该框架,并将其实现为自动化工具PERF。应用于典型分布式系统时,PERF能准确预测不同故障配置下的系统性能,其基于形式化设计的评估结果与实际部署中的性能表现保持一致。