We present a provenance model for the generic workflow of numerical Lattice Quantum Chromodynamics (QCD) calculations, which constitute an important component of particle physics research. These calculations are carried out on the largest supercomputers worldwide and generate and analyze data in the multi-petabyte range. In the Lattice QCD community, a custom metadata standard (QCDml) that includes certain provenance information already exists for one part of the workflow, the so-called generation of configurations. In this paper, we follow the W3C PROV standard and formulate a provenance model that covers both the generation part and the so-called measurement part of the Lattice QCD workflow. We demonstrate the applicability of this model and show how it can be used to answer some provenance-related research questions. However, many important provenance questions in the Lattice QCD community require extensions of this model. To this end, we propose a multi-layered provenance approach that combines prospective and retrospective elements.