We present Prequal (Probing to Reduce Queuing and Latency), a load balancer for distributed multi-tenant systems. Prequal aims to minimize real-time request latency in the presence of heterogeneous server capacities and non-uniform, time-varying antagonist load. It actively probes server load to leverage the power-of-d-choices paradigm, extending it with asynchronous and reusable probes. Cutting against received wisdom, Prequal does not balance CPU load, but instead selects servers according to estimated latency and active requests-in-flight (RIF). We explore its major design features on a testbed system and evaluate it on YouTube, where it has been deployed for more than two years. Prequal has dramatically decreased tail latency, error rates, and resource use, enabling YouTube and other production systems at Google to run at much higher utilization.
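To make the selection idea concrete, the following is a minimal sketch of a power-of-d-choices pick that uses requests-in-flight (RIF) and estimated latency rather than CPU load. It is an illustration under stated assumptions, not Prequal's implementation: the abstract does not say how the two signals are combined, so the rule here (prefer the lowest estimated latency among probed servers whose RIF is below a limit, otherwise fall back to the lowest RIF) is assumed. The names `Probe`, `sample_probes`, `select`, and `rif_limit` are hypothetical, and the probes are synchronous and single-use, whereas Prequal's are asynchronous and reusable.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Probe:
    """One probe response (hypothetical shape, not Prequal's wire format)."""
    server: str
    rif: int            # requests in flight reported by the server
    latency_ms: float   # the server's estimated latency


def sample_probes(load, d, rng):
    """Probe d distinct servers chosen uniformly at random.

    `load` maps server name -> (rif, latency_ms) and stands in for the
    real probe RPC, which this sketch does not model.
    """
    names = rng.sample(sorted(load), d)
    return [Probe(name, *load[name]) for name in names]


def select(probes, rif_limit):
    """Pick a server from the probed candidates.

    Assumed rule: among servers with RIF below rif_limit, take the one
    with the lowest estimated latency; if none qualifies, take the
    server with the lowest RIF.
    """
    lightly_loaded = [p for p in probes if p.rif < rif_limit]
    if lightly_loaded:
        return min(lightly_loaded, key=lambda p: p.latency_ms)
    return min(probes, key=lambda p: p.rif)


if __name__ == "__main__":
    load = {"a": (2, 30.0), "b": (9, 5.0), "c": (3, 12.0), "d": (7, 8.0)}
    probes = sample_probes(load, d=3, rng=random.Random(0))
    print(select(probes, rif_limit=5).server)
```

Note that CPU utilization never appears in the decision, which mirrors the abstract's point that Prequal selects on latency and RIF instead of balancing CPU load.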