Everything Matters in Programmable Packet Scheduling

Programmable packet scheduling allows scheduling algorithms to be deployed on existing switches without the need for hardware redesign. Scheduling algorithms are programmed by tagging packets with ranks that indicate their desired priority. Programmable schedulers then execute these algorithms by serving packets in the order prescribed by their ranks. The ideal programmable scheduler is a Push-In First-Out (PIFO) queue, which achieves perfect packet sorting by pushing packets into arbitrary positions in the queue while draining packets only from the head. Unfortunately, implementing PIFO queues in hardware is challenging because packets must be sorted by rank at line rate. In recent years, various techniques have been proposed that approximate PIFO behavior using the resources available in existing data planes. While promising, approaches to date approximate only one of the characteristic behaviors of PIFO queues (i.e., either their scheduling behavior or their admission control). We propose PACKS, the first programmable scheduler that fully approximates PIFO queues in all their behaviors. PACKS does so by making careful use of a set of strict-priority queues. At enqueue, it uses packet-rank information and queue-occupancy levels to decide whether to admit a packet to the scheduler and how to map admitted packets to the different queues. We fully implement PACKS in P4 and evaluate it on real workloads. We show that PACKS approximates PIFO better than state-of-the-art approaches and that it scales. We also show that PACKS runs at line rate on existing hardware (Intel Tofino).
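
To make the two PIFO behaviors concrete, the following minimal Python sketch models a reference PIFO queue next to a naive approximation built from strict-priority queues. It is an illustration under stated assumptions, not the PACKS algorithm: the class names, the even split of the rank range across queues, and the spill-over rule in `StrictPriorityApprox` are all hypothetical choices made for this example. The PIFO model further assumes that lower ranks are served first and that a full queue drops its highest-rank packet.

```python
import bisect
from collections import deque


class PIFO:
    """Reference model: push in rank order, drain only from the head."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = []  # (rank, seq) tuples kept sorted; lowest rank at the head
        self.seq = 0     # arrival counter, keeps equal ranks in FIFO order

    def enqueue(self, rank):
        """Insert by rank; return the rank dropped to make room, or None."""
        bisect.insort(self.items, (rank, self.seq))
        self.seq += 1
        if len(self.items) > self.capacity:
            # Admission control (assumed): drop the highest-rank packet.
            return self.items.pop()[0]
        return None

    def dequeue(self):
        """Serve the lowest-rank packet, or None if the queue is empty."""
        return self.items.pop(0)[0] if self.items else None


class StrictPriorityApprox:
    """Hypothetical approximation over FIFO queues; NOT the PACKS algorithm."""

    def __init__(self, num_queues, queue_capacity, max_rank):
        self.queues = [deque() for _ in range(num_queues)]
        self.queue_capacity = queue_capacity
        self.max_rank = max_rank

    def enqueue(self, rank):
        """Return True if the packet is admitted, False if it is dropped."""
        n = len(self.queues)
        # Mapping (assumed): split the rank range [0, max_rank] evenly.
        start = min(rank * n // (self.max_rank + 1), n - 1)
        # Admission (assumed): use occupancy to pick the mapped queue or,
        # if it is full, the next lower-priority queue with free space.
        for queue in self.queues[start:]:
            if len(queue) < self.queue_capacity:
                queue.append(rank)
                return True
        return False

    def dequeue(self):
        """Strict priority: serve the first non-empty queue, head first."""
        for queue in self.queues:
            if queue:
                return queue.popleft()
        return None
```

Feeding both models the same rank sequence and comparing which packets are dropped and in what order the rest are served exposes the two kinds of approximation error: the strict-priority version can serve a higher rank before a lower one (a scheduling error) and can drop a packet that the PIFO model would have kept (an admission error).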
