Online Flow Time Minimization with Gradually Revealed Jobs

We consider the problem of online preemptive scheduling on a single machine to minimize the total flow time. In clairvoyant scheduling, where job processing times are revealed upon arrival, the Shortest Remaining Processing Time (SRPT) algorithm is optimal. In practice, however, exact processing times are often unknown. At the opposite extreme, non-clairvoyant scheduling, in which processing times are revealed only upon completion, suffers from strong lower bounds on the competitive ratio. This motivates the study of intermediate information models. We introduce a new model in which processing times are revealed gradually during execution. Each job consists of a sequence of operations, and the processing time of an operation becomes known only after the preceding one completes. This models many scheduling scenarios that arise in computing systems. Our main result is a deterministic $O(m^2)$-competitive algorithm, where $m$ is the maximum number of operations per job. More specifically, we prove a refined competitive ratio in $O(m_1 \cdot m_2)$, where $m_1$ and $m_2$ are instance-dependent parameters describing the operation size structure. Our algorithm and analysis build on recent advancements in robust flow time minimization (SODA '26), where jobs arrive with estimated sizes. However, in our setting we have no bounded estimate on a job's processing time. Thus, we design a highly adaptive algorithm that gradually explores a job's operations while working on them, and groups them into virtual chunks whose size can be well-estimated. This is a crucial ingredient of our result and requires a much more careful analysis compared to the robust setting. We also provide lower bounds showing that our bounds are essentially best possible. For the special case of scheduling with uniform obligatory tests, we show that SRPT at the operation level is $2$-competitive, which is best possible.

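To make the information model concrete, the following minimal sketch simulates a single preemptive machine on which each job is a sequence of operations and only the size of a job's current operation is visible to the scheduler. It is an illustration under stated assumptions, not the algorithm from the paper: time advances in unit steps, operation sizes are positive integers, and the names `ActiveJob`, `total_flow_time`, `operation_level_key`, and `clairvoyant_key` are hypothetical. The "operation-level" rule here simply runs the job whose current operation has the least remaining work, which is one plausible reading of SRPT at the operation level; the clairvoyant rule is included only as a baseline that peeks at hidden future operations.

```python
from dataclasses import dataclass


@dataclass(eq=False)  # identity-based equality, so list.remove targets this exact job
class ActiveJob:
    release: int   # arrival time of the job
    ops: list      # all operation sizes (future ones are hidden from an online rule)
    idx: int       # index of the current operation
    rem: int       # remaining work of the current, already revealed operation


def operation_level_key(job):
    # Uses only revealed information: remaining work of the current operation.
    return job.rem


def clairvoyant_key(job):
    # Baseline that looks at hidden operations: total remaining work (classic SRPT).
    return job.rem + sum(job.ops[job.idx + 1:])


def total_flow_time(instance, key):
    """Simulate preemptive scheduling in unit time steps.

    instance: list of (release_time, [operation sizes]) with positive integer sizes.
    key: priority function; the active job with the smallest key runs next.
    Returns the sum over jobs of (completion time - release time).
    """
    pending = sorted(instance, key=lambda item: item[0])
    active, t, nxt, done, flow = [], 0, 0, 0, 0
    while done < len(pending):
        # Release every job that has arrived by time t; its first operation is revealed.
        while nxt < len(pending) and pending[nxt][0] <= t:
            release, ops = pending[nxt]
            active.append(ActiveJob(release, list(ops), 0, ops[0]))
            nxt += 1
        if not active:
            t = pending[nxt][0]  # machine idles until the next arrival
            continue
        job = min(active, key=key)
        job.rem -= 1
        t += 1
        if job.rem == 0:
            job.idx += 1
            if job.idx == len(job.ops):
                # Last operation finished: the job completes at time t.
                active.remove(job)
                flow += t - job.release
                done += 1
            else:
                # Completing an operation reveals the size of the next one.
                job.rem = job.ops[job.idx]
    return flow


if __name__ == "__main__":
    instance = [(0, [1, 10]), (0, [3])]
    print(total_flow_time(instance, operation_level_key))
    print(total_flow_time(instance, clairvoyant_key))
```

On this two-job instance the operation-level rule yields total flow time 18, while the clairvoyant baseline yields 17: the short first operation lures the online rule into starting the long job before the short one. The gap is small here, but the same effect is what the gradual-revelation model has to control.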