The online Data Quality Monitoring (DQM) system of the CMS electromagnetic calorimeter (ECAL) is a crucial operational tool that allows ECAL experts to quickly identify, localize, and diagnose a broad range of detector issues that would otherwise hinder physics-quality data taking. Although the existing ECAL DQM system has been continuously updated to respond to new problems, it remains one step behind new and unforeseen issues. Using unsupervised deep learning, a real-time autoencoder-based anomaly detection system is developed that is able to detect ECAL anomalies unseen in past data. After accounting for spatial variations in the response of the ECAL and the temporal evolution of anomalies, the new system is able to efficiently detect anomalies while maintaining an estimated false discovery rate between $10^{-2}$ and $10^{-4}$, beating existing benchmarks by about two orders of magnitude. The real-world performance of the system is validated using anomalies found in 2018 and 2022 LHC collision data. Additionally, first results from deploying the autoencoder-based system in the CMS online DQM workflow for the ECAL barrel during Run 3 of the LHC are presented, showing its promising performance in detecting obscure issues that could have been missed by the existing DQM system.
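As a rough illustration of the ideas summarized above (thresholding a per-tower reconstruction loss after a spatial normalization, requiring anomalies to persist in time, and estimating a false discovery rate), a minimal NumPy sketch is given below. It does not reproduce the authors' network or their actual corrections. The function names `flag_anomalies` and `false_discovery_rate`, the division by a per-tower reference loss, the fixed threshold, and the persistence window are all assumptions made for the example; the loss maps are assumed to come from some already-trained autoencoder.

```python
import numpy as np


def flag_anomalies(loss_maps, reference_loss, threshold, persistence=3):
    """Flag towers whose normalized loss stays above a cut for several slices.

    loss_maps:      array (T, H, W), per-tower reconstruction loss for T
                    consecutive time slices (hypothetical autoencoder output)
    reference_loss: array (H, W), typical per-tower loss on good data, used
                    here as an assumed spatial normalization
    threshold:      scalar cut applied to the normalized loss
    persistence:    number of consecutive slices a tower must exceed the cut
    Returns a boolean array of shape (T - persistence + 1, H, W).
    """
    loss_maps = np.asarray(loss_maps, dtype=float)
    reference_loss = np.asarray(reference_loss, dtype=float)

    # Spatial step (assumption): divide out the tower-dependent baseline so
    # one threshold can be applied across the whole detector map.
    normalized = loss_maps / reference_loss
    above = normalized > threshold

    n_windows = above.shape[0] - persistence + 1
    if n_windows < 1:
        raise ValueError("need at least `persistence` time slices")

    # Temporal step (assumption): keep only towers above the cut in every
    # slice of a sliding window, suppressing one-off fluctuations.
    persistent = np.ones((n_windows,) + above.shape[1:], dtype=bool)
    for k in range(persistence):
        persistent &= above[k:k + n_windows]
    return persistent


def false_discovery_rate(flagged, truth):
    """Fraction of flagged entries that are not true anomalies: FP / (FP + TP)."""
    flagged = np.asarray(flagged, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    n_flagged = flagged.sum()
    if n_flagged == 0:
        return 0.0  # nothing flagged, so no false discoveries
    false_positives = np.logical_and(flagged, ~truth).sum()
    return float(false_positives / n_flagged)


# Toy usage on a 2x2 map over 4 time slices (made-up numbers).
reference = np.array([[1.0, 2.0], [1.0, 1.0]])
losses = np.ones((4, 2, 2))
losses[:, 0, 0] = 5.0   # persistently high loss: should be flagged
losses[:, 0, 1] = 3.0   # high raw loss, but only 1.5x its own baseline
losses[0, 1, 1] = 5.0   # single-slice spike: should be suppressed
flags = flag_anomalies(losses, reference, threshold=2.0, persistence=3)
```

In this toy case only the tower at index (0, 0) is flagged: the tower at (0, 1) falls below the cut once its larger baseline is divided out, and the spike at (1, 1) does not persist across the window.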