Multi-server jobs, which request multiple computing devices and hold them for the duration of their execution, dominate modern computing clusters. When allocating computing devices to such jobs, it is difficult to balance parallel computation gains against internal communication overheads. First, the computation gain does not increase linearly with the number of computing devices. Second, the device type that dominates the communication overhead varies across job types. To achieve a better gain-overhead tradeoff, we formulate a cumulative reward maximization program and design an online algorithm, OGASched, to schedule multi-server jobs. The reward of a job is defined as the parallel computation gain aggregated over its allocated computing devices minus a penalty on the dominant communication overhead. OGASched allocates computing devices to each arriving job along the ascent direction of the reward gradients. With concave rewards, OGASched achieves a best-so-far regret that grows sublinearly with the number of job types and the time slot length. OGASched also runs several sub-procedures in parallel, which greatly reduces its complexity. We conduct extensive trace-driven simulations to validate the performance of OGASched. The results demonstrate that OGASched outperforms widely used heuristics by $11.33\%$, $7.75\%$, $13.89\%$, and $13.44\%$, respectively.
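The reward structure and the gradient-ascent allocation described in the abstract can be illustrated with a small sketch. This is not the OGASched algorithm itself. It is a toy for a single job type, under assumptions the abstract does not state: a logarithmic gain per device type, a linear communication overhead, a simple per-type capacity box as the feasible set, and a step size of `eta0 / sqrt(t)`. All names (`reward`, `supergradient`, `ascent_step`, `run`, `gain_w`, `comm_w`, `beta`) are hypothetical.

```python
import math


def reward(x, gain_w, comm_w, beta):
    """Concave gain summed over device types minus a penalty on the dominant overhead.

    Assumed forms (not from the source): gain a_k * log(1 + x_k), overhead c_k * x_k.
    """
    gain = sum(a * math.log1p(v) for a, v in zip(gain_w, x))
    dominant = max(c * v for c, v in zip(comm_w, x))
    return gain - beta * dominant


def supergradient(x, gain_w, comm_w, beta):
    """One supergradient of the concave reward at x."""
    g = [a / (1.0 + v) for a, v in zip(gain_w, x)]
    overheads = [c * v for c, v in zip(comm_w, x)]
    k = overheads.index(max(overheads))  # device type with the dominant overhead
    g[k] -= beta * comm_w[k]             # the penalty only affects that coordinate
    return g


def ascent_step(x, grad, eta, capacity):
    """Move along the ascent direction, then project back onto the box [0, capacity]."""
    return [min(max(v + eta * g, 0.0), cap) for v, g, cap in zip(x, grad, capacity)]


def run(rounds=200, eta0=0.5):
    """Repeat the ascent step with a decaying step size and record the rewards."""
    gain_w = [3.0, 2.0, 1.0]      # hypothetical gain weights per device type
    comm_w = [0.5, 0.2, 0.1]      # hypothetical overhead weights per device type
    beta = 1.0                    # hypothetical penalty coefficient
    capacity = [8.0, 8.0, 8.0]    # hypothetical per-type capacity
    x = [0.0, 0.0, 0.0]
    history = []
    for t in range(1, rounds + 1):
        history.append(reward(x, gain_w, comm_w, beta))
        g = supergradient(x, gain_w, comm_w, beta)
        x = ascent_step(x, g, eta0 / math.sqrt(t), capacity)
    return x, history


if __name__ == "__main__":
    allocation, history = run()
    print("final allocation:", [round(v, 3) for v in allocation])
    print("first reward:", round(history[0], 3), "best reward:", round(max(history), 3))
```

Because the gain is concave and the penalty is a maximum of linear terms, the reward stays concave, which is the setting in which the abstract states its regret result. The projection onto a box is the simplest possible feasible set; a cluster with capacity shared across job types would need a different projection.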