The Transformer Machine Learning (ML) architecture has been gaining considerable momentum in recent years. In particular, computational High-Energy Physics tasks such as jet tagging and particle track reconstruction (tracking) have either been solved effectively, or reached considerable milestones, using Transformers. On the other hand, the use of specialised hardware accelerators, especially FPGAs, is an effective method for achieving online, or pseudo-online, latencies. However, the development and integration of Transformer-based ML on FPGAs is still ongoing, and support from current tools ranges from very limited to non-existent. Additionally, FPGA resources present a significant constraint: considering model size alone, smaller models can be deployed directly, whereas larger models must be partitioned in a meaningful and, ideally, automated way. We aim to develop methodologies and tools for monolithic or partitioned Transformer synthesis, specifically targeting inference. Our primary use case involves two machine learning model designs for tracking, derived from the TrackFormers project. We elaborate on our development approach, present preliminary results, and provide comparisons.