Scheduling a task graph that represents an application over a heterogeneous network of computers is a fundamental problem in distributed computing. The problem is known to be not only NP-hard but also not polynomial-time approximable within a constant factor. As a result, many heuristic algorithms have been proposed over the past few decades. Yet it remains largely unclear how these algorithms compare with one another in terms of the quality of the schedules they produce. We identify gaps in the traditional benchmarking approach to comparing task scheduling algorithms and propose PISA, a simulated annealing-based adversarial analysis approach, to help address them. We also introduce SAGA, a new open-source library for comparing task scheduling algorithms. We use SAGA to benchmark 15 algorithms on 16 datasets and PISA to compare the algorithms pairwise. Algorithms that appear to perform similarly on benchmarking datasets are shown to perform very differently on adversarially chosen problem instances. Interestingly, the results indicate that this holds even when the adversarial search is constrained to well-structured, application-specific problem instances. This work represents an important step towards a more general understanding of the performance boundaries between task scheduling algorithms on different families of problem instances.
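To make the idea of a simulated annealing-based adversarial search concrete, the following is a minimal sketch under strong simplifying assumptions. It is not PISA's implementation and does not use SAGA's API. The instance model (independent tasks with costs on nodes with speeds, rather than a full task graph over a network), the two toy schedulers `fifo` and `largest_first`, and all function names and parameters are hypothetical and chosen only for illustration. The search perturbs a problem instance and tries to maximize the makespan ratio of a target scheduler to a baseline scheduler.

```python
import math
import random


def makespan(costs, speeds, order):
    """Greedy list scheduling: place each task, in the given order,
    on the node where it would finish earliest."""
    finish = [0.0] * len(speeds)
    for i in order:
        best = min(range(len(speeds)),
                   key=lambda n: finish[n] + costs[i] / speeds[n])
        finish[best] += costs[i] / speeds[best]
    return max(finish)


def fifo(costs, speeds):
    """Toy scheduler (assumption): tasks in their given order."""
    return makespan(costs, speeds, range(len(costs)))


def largest_first(costs, speeds):
    """Toy scheduler (assumption): most expensive tasks first."""
    order = sorted(range(len(costs)), key=lambda i: -costs[i])
    return makespan(costs, speeds, order)


def ratio(instance, target, baseline):
    """Makespan of the target scheduler relative to the baseline."""
    costs, speeds = instance
    return target(costs, speeds) / baseline(costs, speeds)


def perturb(instance, rng):
    """Rescale one random task cost or node speed, clamped to [0.1, 10]."""
    costs, speeds = list(instance[0]), list(instance[1])
    values = rng.choice((costs, speeds))
    k = rng.randrange(len(values))
    values[k] = min(10.0, max(0.1, values[k] * rng.uniform(0.5, 2.0)))
    return costs, speeds


def adversarial_search(target, baseline, instance,
                       steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Simulated annealing over instances, maximizing the makespan ratio."""
    rng = random.Random(seed)
    current, current_r = instance, ratio(instance, target, baseline)
    best, best_r = current, current_r
    temp = t0
    for _ in range(steps):
        cand = perturb(current, rng)
        cand_r = ratio(cand, target, baseline)
        # Always accept improvements; accept a worse candidate with
        # probability exp(delta / temp), which shrinks as temp cools.
        if cand_r >= current_r or \
                rng.random() < math.exp((cand_r - current_r) / temp):
            current, current_r = cand, cand_r
            if current_r > best_r:
                best, best_r = current, current_r
        temp *= cooling
    return best, best_r


if __name__ == "__main__":
    start = ([1.0] * 6, [1.0, 1.0])  # six unit-cost tasks, two unit-speed nodes
    instance, r = adversarial_search(fifo, largest_first, start)
    print("worst ratio found:", round(r, 3))
    print("task costs:", [round(c, 2) for c in instance[0]])
    print("node speeds:", [round(s, 2) for s in instance[1]])
```

Swapping the `target` and `baseline` arguments searches in the opposite direction, which is how a pairwise comparison could be assembled from this sketch. Because both toy schedulers behave identically on the uniform starting instance, the ratio there is exactly 1, and any larger value returned is an instance on which the target does worse than the baseline.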