In high-energy physics, the increasing luminosity and detector granularity at the Large Hadron Collider are driving the need for more efficient data processing solutions. Machine learning has emerged as a promising tool for reconstructing charged particle tracks, due to its potentially linear computational scaling with the number of detector hits. The recent implementation of a graph neural network-based track reconstruction pipeline in the first level trigger of the LHCb experiment on GPUs serves as a platform for comparative studies between computational architectures in the context of high-energy physics. This paper presents a novel comparison of the throughput of ML model inference between FPGAs and GPUs, focusing on the first step of the track reconstruction pipeline: an implementation of a multilayer perceptron. Using HLS4ML for FPGA deployment, we benchmark its performance against the GPU implementation and demonstrate the potential of FPGAs for high-throughput, low-latency inference without the need for expertise in FPGA development, while consuming significantly less power.
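To make the workload concrete, the sketch below shows plain feed-forward inference for a small multilayer perceptron of the kind the pipeline's first step applies to detector-hit features. The layer sizes, batch size, and random weights here are illustrative assumptions for demonstration, not the architecture or parameters used in the experiment.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    # Elementwise rectified linear unit, the usual MLP activation.
    return np.maximum(x, 0.0)

def mlp_forward(x, weights, biases):
    """Feed-forward inference: affine layer + ReLU for each hidden layer,
    then a final linear output layer."""
    for w, b in zip(weights[:-1], biases[:-1]):
        x = relu(x @ w + b)
    return x @ weights[-1] + biases[-1]

# Hypothetical 3 -> 32 -> 32 -> 1 network over per-hit feature vectors.
sizes = [3, 32, 32, 1]
weights = [rng.standard_normal((n_in, n_out)) * 0.1
           for n_in, n_out in zip(sizes, sizes[1:])]
biases = [np.zeros(n_out) for n_out in sizes[1:]]

batch = rng.standard_normal((1024, 3))   # a batch of hit feature vectors
scores = mlp_forward(batch, weights, biases)
print(scores.shape)
```

In practice a trained model of this shape would be handed to a tool such as HLS4ML, which converts it to HLS code for FPGA synthesis; the arithmetic performed per inference is exactly the chain of matrix-vector products above, which is why throughput scales roughly linearly with the number of hits.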