We study the problem of scheduling delay-sensitive jobs on spot and on-demand cloud instances to minimize average cost subject to an average delay constraint. Jobs arrive according to a general stochastic process and incur different costs depending on the instance type. This work provides the first analytical treatment of this problem, using tools from queuing theory, stochastic processes, and optimization. We derive cost expressions for general policies, prove that a queue length of one is optimal for low target delays, and characterize the optimal wait-time distribution. For high target delays, we identify a knapsack structure and design a scheduling policy that exploits it. We also propose an adaptive algorithm that fully utilizes the allowed delay, and empirical results confirm its near-optimality.