Subatomic particle track reconstruction (tracking) is a vital task in High-Energy Physics experiments. Tracking is exceptionally computationally challenging, and fielded solutions, which rely on traditional algorithms, do not scale linearly. Machine Learning (ML) assisted solutions are a promising answer. We argue that a complexity-reduced problem description, together with the data representing it, will facilitate the solution exploration workflow. We provide the REDuced VIrtual Detector (REDVID) as a combined complexity-reduced detector model and particle collision event simulator. REDVID is intended as a simulation-in-the-loop, both to generate synthetic data efficiently and to simplify the challenge of ML model design. In contrast to physics-accurate simulations, our tool is fully parametric with regard to system-level configuration, which allows for the generation of simplified data for research and education at different levels. To showcase the computational efficiency that results from the reduced complexity, we provide computational cost figures for a multitude of simulation benchmarks. As a simulation and generative tool for ML-assisted solution design, REDVID is highly flexible, reusable and open-source. Reference data sets generated with REDVID are publicly available. Data generated using REDVID has enabled the rapid development of multiple novel ML model designs, work which is currently ongoing.