Programmable packet-processing pipelines are a core building block of modern SmartNICs and switches, yet their design requires navigating intertwined trade-offs among program feasibility, hardware cost, and system-level performance. Existing approaches rely on proxy metrics such as stage or ALU count, which often mispredict capability and end-to-end behavior. We present Kugelblitz, a framework for executable, cost-aware design-space exploration of programmable packet pipelines. Kugelblitz decouples packet-processing programs from pipeline architectures and uses compiler-based feasibility checking to prune designs that cannot support target workloads. For feasible architectures, Kugelblitz automatically generates synthesizable RTL, enabling synthesis-backed area and timing estimation and cycle-accurate full-system evaluation with real application workloads. Using representative programs including NAT, firewalling, and an in-network key-value cache, we show that proxy metrics substantially overestimate capability, that performance rankings change under system-level evaluation, and that the cost of supporting richer workloads is highly non-linear.