We study the problem of scheduling an arbitrary computational DAG on a fixed number of processors while minimizing the makespan. Whereas previous work has mostly studied this problem in fairly restricted models, we define and analyze DAG scheduling in the Bulk Synchronous Parallel (BSP) model, a well-established parallel computing model that captures the communication cost between processors much more accurately. We provide a taxonomy of simpler scheduling models that can be understood as variants or special cases of BSP, and discuss how the properties and optimum cost of these models relate to BSP. This allows us to dissect the different building blocks of the BSP model and to gain insight into how each of them influences the scheduling problem. We then analyze the hardness of DAG scheduling in BSP in detail. We show that the problem is solvable in polynomial time for some very simple classes of DAGs, but that it is already NP-hard for in-trees and for DAGs of height 2. We also prove that for general DAGs the problem is APX-hard: for some specific $\epsilon>0$, it cannot be approximated to a $(1+\epsilon)$-factor in polynomial time unless P = NP. We then separately study the subproblem of scheduling communication steps, and we show that whether this subproblem is NP-hard depends on the problem parameters and on the communication rules within the BSP model. Finally, we present and analyze a natural formulation of our scheduling task as an Integer Linear Program.
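To make concrete what "capturing the communication cost" means, the following is a minimal sketch of how a BSP schedule is typically costed. It assumes the textbook BSP cost function, in which each superstep costs its maximum per-processor work, plus `g` times its h-relation (the largest amount any processor sends or receives), plus a latency term; the exact definition used in the paper may differ in details, for example in whether latency is charged for supersteps without communication. The function name `bsp_cost`, its parameters, and the example data are hypothetical, and the sketch does not check that the schedule respects the precedence constraints of the DAG.

```python
from collections import defaultdict


def bsp_cost(work, proc, step, comm, g, latency):
    """Cost of a BSP schedule under the textbook cost function (an assumption).

    work:    dict node -> computation weight
    proc:    dict node -> processor the node is assigned to
    step:    dict node -> superstep index (0-based) the node is computed in
    comm:    list of (src_proc, dst_proc, superstep, volume) transfers
    g:       cost per unit of communicated data
    latency: fixed synchronization cost charged once per superstep
    """
    num_steps = max(step.values()) + 1

    # Total work of each processor in each superstep.
    work_in = defaultdict(int)
    for node, weight in work.items():
        work_in[(step[node], proc[node])] += weight

    # Data sent and received by each processor in each superstep.
    sent = defaultdict(int)
    received = defaultdict(int)
    for src, dst, s, volume in comm:
        sent[(s, src)] += volume
        received[(s, dst)] += volume

    total = 0
    for s in range(num_steps):
        # Work phase: the slowest processor determines the cost.
        w = max((x for (t, _), x in work_in.items() if t == s), default=0)
        # Communication phase: h-relation = max over processors of sent/received.
        h = max(
            (x for table in (sent, received)
             for (t, _), x in table.items() if t == s),
            default=0,
        )
        total += w + g * h + latency
    return total


# Hypothetical example: node "c" depends on "a" and "b".
work = {"a": 2, "b": 3, "c": 1}
proc = {"a": 0, "b": 1, "c": 0}
step = {"a": 0, "b": 0, "c": 1}
# The output of "b" is sent from processor 1 to processor 0 in superstep 0.
comm = [(1, 0, 0, 1)]

print(bsp_cost(work, proc, step, comm, g=2, latency=5))  # prints 16
```

In this example superstep 0 costs 3 + 2·1 + 5 = 10 and superstep 1 costs 1 + 0 + 5 = 6. Placing all three nodes on one processor would avoid the communication and the second synchronization but serialize the work, which is the kind of trade-off the scheduling problem has to resolve.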