This paper addresses deadline-aware online scheduling of jobs in hybrid cloud environments, where each job may run on cost-effective but unreliable spot instances or on more expensive on-demand instances, subject to a hard deadline. We first establish a fundamental limit for existing, predominantly deterministic policies by proving that their worst-case competitive ratio is $\Omega(K)$, where $K$ is the ratio of on-demand cost to spot cost. We then present ROSS, a novel randomized scheduling algorithm that achieves a provably optimal competitive ratio of $\sqrt{K}$ under reasonable deadlines, a significant improvement over existing approaches. Extensive evaluations on real-world traces from Azure and AWS show that ROSS effectively balances cost optimization against deadline guarantees, consistently outperforming the state of the art with up to $30\%$ greater cost savings across diverse spot market conditions.
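To give a sense of scale for the gap between the two bounds, the following is an illustration only. It ignores the constants hidden in the asymptotic notation and assumes a hypothetical cost ratio $K = 9$, which is not a value taken from the paper:

$$
\frac{K}{\sqrt{K}} = \sqrt{K}, \qquad K = 9 \;\Rightarrow\; \sqrt{K} = 3 .
$$

Under these assumptions, a policy whose worst-case ratio grows linearly in $K$ could pay on the order of $9$ times the offline optimum, whereas a $\sqrt{K}$-competitive policy would pay on the order of $3$ times, and the gap widens as $K$ grows.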