TrackFormers: In Search of Transformer-Based Particle Tracking for the High-Luminosity LHC Era

High-energy physics experiments face a multi-fold increase in data with every new iteration, and the upcoming High-Luminosity LHC upgrade is no exception. These increased data processing requirements force revisions to almost every step of the data processing pipeline. One step in need of an overhaul is particle track reconstruction, also known as tracking. Because the most time-consuming step in tracking is the assignment of hits to particles or track candidates, a machine learning-assisted solution is expected to provide significant improvements. This is the topic of this paper. We take inspiration from large language models and consider two approaches: predicting the next word in a sentence (the next hit point in a track), and one-shot prediction of all hits within an event. In an extensive design effort, we experimented with three models based on the Transformer architecture and one model based on the U-Net architecture, each performing track association predictions for collision event hit points. In our evaluation, we consider a spectrum of simple to complex representations of the problem, eliminating designs with lower metrics early on. We report extensive results covering both prediction accuracy (score) and computational performance. To compose five data sets for our experiments, ranging from simple to complex, we used the REDVID simulation framework as well as reductions applied to the TrackML data set. The results highlight distinct advantages among the designs in terms of prediction accuracy and computational performance, demonstrating the efficiency of our methodology. Most importantly, the results show the viability of a one-shot encoder-classifier Transformer solution as a practical approach to tracking.
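
The two problem framings named in the abstract can be illustrated with a minimal data-preparation sketch. It is not the authors' code: the toy event, the function names `next_hit_examples` and `one_shot_example`, and the choice to order hits by distance from the origin are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import defaultdict

# Hypothetical toy event: each hit is (x, y, z, track_id). All values are invented.
EVENT = [
    (1.0, 0.1, 0.0, 0),
    (1.0, -0.2, 0.1, 1),
    (2.0, 0.2, 0.0, 0),
    (2.0, -0.4, 0.2, 1),
    (3.0, 0.3, 0.0, 0),
]


def next_hit_examples(event):
    """Next-hit framing: pair each prefix of a track with the hit that follows it."""
    tracks = defaultdict(list)
    for x, y, z, track_id in event:
        tracks[track_id].append((x, y, z))

    examples = []
    for hits in tracks.values():
        # Assumption: hits are ordered by squared distance from the origin.
        hits.sort(key=lambda h: h[0] ** 2 + h[1] ** 2 + h[2] ** 2)
        for i in range(1, len(hits)):
            examples.append((hits[:i], hits[i]))  # (prefix, next hit)
    return examples


def one_shot_example(event):
    """One-shot framing: all hits of the event in, one track label per hit out."""
    coords = [(x, y, z) for x, y, z, _ in event]
    labels = [track_id for _, _, _, track_id in event]
    return coords, labels


if __name__ == "__main__":
    for prefix, target in next_hit_examples(EVENT):
        print("prefix:", prefix, "-> next hit:", target)
    coords, labels = one_shot_example(EVENT)
    print("hits:", len(coords), "labels:", labels)
```

In the next-hit framing a model would be called once per step of each track, whereas in the one-shot framing a single forward pass labels every hit in the event, which is the trade-off between the two approaches that the abstract evaluates.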

