Sharp Waiting-Time Bounds for Multiserver Jobs

Multiserver jobs, which occupy multiple servers simultaneously during service, are prevalent in today's computing clusters, yet little is known about the delay performance of systems that serve them. We consider queueing models for multiserver jobs in scaling regimes where the system load becomes heavy while the total number of servers in the system and the number of servers that a job needs both become large. Prior work has derived upper bounds on the queueing probability in this scaling regime. However, without matching lower bounds, the existing results cannot be used to differentiate between policies. In this paper, we study the delay performance by establishing sharp bounds on the mean waiting time of multiserver jobs, where the waiting time of a job is the time it spends queueing rather than in service. We first characterize the exact order of the mean waiting time under the First-Come-First-Serve (FCFS) policy. We then prove a lower bound on the mean waiting time that holds for all policies and that has an order gap with the mean waiting time under FCFS. Finally, we show that the lower bound is achievable under a priority policy that we call Smallest-Need-First (SNF).
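To make the difference between the two policies concrete, the following is a minimal simulation sketch and not the paper's model or analysis. It assumes a non-preemptive variant of both policies (the abstract does not say whether SNF preempts), and the function name `simulate`, the job tuples, and the toy workload are all hypothetical. It only illustrates how FCFS can leave servers idle behind a large job at the head of the queue, while a smallest-need-first rule lets a small job start immediately.

```python
import heapq


def simulate(jobs, n_servers, policy):
    """Return the mean waiting time for a list of multiserver jobs.

    jobs: list of (arrival_time, server_need, service_time), sorted by arrival.
    policy: "FCFS" (only the oldest waiting job may start) or
            "SNF" (the waiting job with the smallest server need starts first).
    Both policies are non-preemptive here; that is an assumption of this sketch.
    """
    if any(need > n_servers for _, need, _ in jobs):
        raise ValueError("a job needs more servers than the system has")

    free = n_servers
    departures = []  # min-heap of (finish_time, server_need)
    waiting = []     # jobs that have arrived but not started
    total_wait = 0.0
    i = 0

    while i < len(jobs) or waiting:
        next_arrival = jobs[i][0] if i < len(jobs) else float("inf")
        next_departure = departures[0][0] if departures else float("inf")
        now = min(next_arrival, next_departure)

        if next_departure <= next_arrival:
            # A job finishes and releases its servers.
            _, need = heapq.heappop(departures)
            free += need
        else:
            # A new job joins the queue.
            waiting.append(jobs[i])
            i += 1

        # Start as many jobs as the policy and the free servers allow.
        while waiting:
            if policy == "FCFS":
                idx = 0
            else:  # "SNF": smallest need first, ties broken by arrival time
                idx = min(range(len(waiting)),
                          key=lambda k: (waiting[k][1], waiting[k][0]))
            arrival, need, service = waiting[idx]
            if need > free:
                break  # the candidate does not fit, so nothing starts now
            waiting.pop(idx)
            free -= need
            total_wait += now - arrival
            heapq.heappush(departures, (now + service, need))

    return total_wait / len(jobs)


# Toy workload on 4 servers: (arrival_time, server_need, service_time).
toy_jobs = [(0, 3, 10), (1, 4, 10), (2, 1, 1)]
fcfs_wait = simulate(toy_jobs, 4, "FCFS")
snf_wait = simulate(toy_jobs, 4, "SNF")
print(fcfs_wait, snf_wait)
```

In this toy workload the third job needs one server and one is free when it arrives, but under FCFS it is blocked behind the second job, which needs all four. The example says nothing about the scaling regime or the order of the bounds studied in the paper.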

