The Transformer machine learning (ML) architecture has gained considerable momentum in recent years. In particular, computational High-Energy Physics tasks such as jet tagging and particle track reconstruction (tracking) have either been solved effectively or reached considerable milestones using Transformers. Meanwhile, specialised hardware accelerators, especially FPGAs, offer an effective means of achieving online or pseudo-online latencies. The development and integration of Transformer-based ML on FPGAs is still ongoing, and support from current tools is very limited or non-existent. Additionally, FPGA resources present a significant constraint. Considering model size alone, smaller models can be deployed directly, whereas larger models must be partitioned in a meaningful and, ideally, automated way. We aim to develop methodologies and tools for monolithic or partitioned Transformer synthesis, specifically targeting inference. Our primary use case involves two ML model designs for tracking, derived from the TrackFormers project. We elaborate on our development approach, present preliminary results, and provide comparisons.