Conditions data is the subset of non-event data that is necessary to process event data. It poses a unique set of challenges, namely a heterogeneous structure and high access rates from distributed computing. The HSF Conditions Databases activity is a forum for cross-experiment discussions that invites as broad a participation as possible. It grew out of the HSF Community White Paper work to study conditions data access, where experts from ATLAS, Belle II, and CMS converged on a common language and proposed a schema that represents best practice. Following discussions with a broader community, including nuclear physics (NP) as well as high-energy physics (HEP) experiments, a core set of use cases, functionality, and behaviour was defined with the aim of describing a core conditions database API. This paper describes the reference implementation of both the conditions database service and the client, which together encapsulate HSF best practice for conditions data handling. Django was chosen for the service implementation, which uses the Django ORM rather than direct SQL for all but one method. The simple relational database schema that organises conditions data is implemented in PostgreSQL. Storage of the conditions data payloads themselves is delegated to any POSIX-compliant filesystem, allowing for transparent relocation and redundancy. Crucially, this design provides a clear separation between retrieving the metadata that describes which conditions data are needed for a data processing job and retrieving the actual payloads from storage. The service deployment using Helm on OKD is described, together with scaling tests and operations experience from the sPHENIX experiment, which runs on more than 25k cores at BNL.
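The separation between metadata lookup and payload retrieval described above can be illustrated with a minimal sketch. It is not the HSF reference implementation or its API: the in-memory `CATALOGUE`, the `PayloadIOV` class, the functions `resolve_payload` and `read_payload`, and the tag and file names are all hypothetical, and the use of global tags and intervals of validity keyed by run number is an assumption about how such a schema is typically organised. The first function stands in for the database service (answering "which payload is needed?"), while the second only touches a POSIX filesystem and tries several storage prefixes in turn, which is one way relocation and redundancy could remain transparent to the client.

```python
from bisect import bisect_right
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class PayloadIOV:
    """One interval of validity: valid from `start` until the next IOV begins."""
    start: int          # first run for which the payload is valid (assumed key)
    relative_path: str  # payload location relative to a storage prefix


# Hypothetical metadata: global tag -> payload type -> IOVs sorted by start.
# In a real service this would live in the relational database.
CATALOGUE = {
    "example_gt": {
        "beam_spot": [
            PayloadIOV(0, "beam_spot/v1.dat"),
            PayloadIOV(100, "beam_spot/v2.dat"),
        ],
    },
}


def resolve_payload(global_tag: str, payload_type: str, run: int) -> str:
    """Metadata step: return the relative path of the payload valid for `run`."""
    iovs = CATALOGUE[global_tag][payload_type]
    starts = [iov.start for iov in iovs]
    index = bisect_right(starts, run) - 1  # last IOV with start <= run
    if index < 0:
        raise LookupError(f"no {payload_type} payload valid for run {run}")
    return iovs[index].relative_path


def read_payload(relative_path: str, prefixes: list[Path]) -> bytes:
    """Storage step: try each storage prefix in order until the file is found."""
    for prefix in prefixes:
        candidate = prefix / relative_path
        if candidate.is_file():
            return candidate.read_bytes()
    raise FileNotFoundError(relative_path)
```

Because the metadata step returns only a relative path, the payload files can be moved or mirrored under a different prefix without any change to the catalogue entries.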