Performance verification is a nascent but promising tool for understanding the performance and limitations of heuristics under realistic assumptions. Bespoke performance verification tools have already demonstrated their value in settings like congestion control and packet scheduling. In this paper, we aim to emphasize the broad applicability and utility of performance verification. To that end, we highlight the design principles of performance verification. We then leverage that understanding to develop a set of easy-to-follow guidelines that apply to a wide range of resource allocation heuristics. In particular, we introduce Virelay, a framework that enables heuristic designers to express the behavior of their algorithms and their assumptions about the system in an environment that resembles a discrete-event simulator. We demonstrate the utility and ease of use of Virelay by applying it to six diverse case studies. We produce bounds on the performance of classical algorithms, namely work stealing and SRPT scheduling, under practical assumptions. We demonstrate Virelay's expressiveness by capturing existing models for congestion control and packet scheduling, and we verify the observation that TCP unfairness can cause some ML training workloads to spontaneously converge to a state of high network utilization. Finally, we use Virelay to identify two bugs in the Linux CFS load balancer.