We study the classical scheduling problem on parallel machines with precedence constraints, where the precedence graph has bounded depth $h$. Our goal is to minimize the maximum completion time. We focus on developing approximation algorithms that use only sublinear space or sublinear time. We develop the first one-pass streaming approximation schemes using sublinear space for the case where all jobs' processing times differ by no more than a constant factor $c$ and the number of machines $m$ is at most $\tfrac{2n\epsilon}{3hc}$. This is so far the best approximation we can have in terms of $m$, since no polynomial-time approximation better than $\tfrac{4}{3}$ exists when $m = \tfrac{n}{3}$, even if all jobs have equal processing time, unless P=NP. The algorithms are then extended to the more general problem where the largest $\alpha n$ jobs differ by no more than a factor of $c$, for some constant $0 < \alpha \le 1$. We also develop the first sublinear-time algorithms for both problems. For the more general problem, when $m \le \tfrac{\alpha n \epsilon}{20 c^2 \cdot h}$, our algorithm is a randomized $(1+\epsilon)$-approximation scheme that runs in sublinear time. This work not only provides an algorithmic solution to the studied problem in big data and cloud computing environments, but also gives a methodological framework for designing sublinear approximation algorithms for other scheduling problems.