The former CMS Run 2 High Level Trigger (HLT) farm is one of the largest contributors to CMS compute resources, providing about 25k job slots for offline computing. During LHC Run 2, this CPU farm was initially employed as an opportunistic resource, exploited during inter-fill periods. Since then, it has become a nearly transparent extension of the CMS capacity at CERN, as it is located on-site at LHC interaction point 5 (P5), where the CMS detector is installed. The resource has been configured to support the execution of critical CMS tasks, such as prompt reconstruction of detector data. It can therefore be used in combination with the dedicated Tier 0 capacity at CERN to process and absorb peaks in the stream of data coming from the CMS detector. The initial configuration, based on statically configured VMs, provided the required level of functionality. However, regular operation of the cluster revealed certain limitations compared to the resource provisioning and use model employed at WLCG sites. A new configuration, based on a vacuum-like model, has been implemented to address these shortcomings. This paper reports on the redeployment of this permanent cloud for enhanced support of CMS offline computing, comparing the functionality of the former and new models and describing the commissioning effort for the new setup.