In scientific computing, properly supporting the modularity and complexity of modern applications requires new approaches to workflow execution, such as seamless interoperability between different workflow systems, distributed-by-design workflow models, and automatic optimisation of data movements. To address this need, this article introduces SWIRL, an intermediate representation language for scientific workflows. In contrast with other product-agnostic workflow languages, SWIRL is not designed for human interaction but to serve as a low-level compilation target for distributed workflow execution plans. Its main advantages are twofold: low-level primitives based on the send/receive programming model, and a formal framework that ensures the consistency of the semantics and specifies how workflow models represented as Directed Acyclic Graphs (DAGs) are translated into SWIRL workflow descriptions. Additionally, SWIRL offers rewriting rules designed to optimise execution traces, accompanied by the corresponding equivalence results. An open-source SWIRL compiler toolchain has been developed using the ANTLR Python3 bindings.
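As an informal illustration of the send/receive model described above, the following Python sketch turns a small DAG of steps, each mapped to an execution location, into one ordered list of primitives per location. It is not SWIRL syntax and not the algorithm used by the SWIRL toolchain: the function name `compile_plan`, the tuple encoding of `exec`/`send`/`recv`, and the example step and location names are all assumptions made for this example.

```python
from collections import defaultdict
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def compile_plan(deps, location):
    """Build a per-location trace of primitives from a workflow DAG.

    deps:     maps each step to the set of steps it depends on.
    location: maps each step to the (hypothetical) location that runs it.
    Returns a dict mapping each location to an ordered list of tuples:
    ("exec", step), ("send", data, dst) or ("recv", data, src).
    """
    plan = defaultdict(list)
    transferred = set()  # (producer step, destination) pairs already shipped

    # Visit steps so that every producer is handled before its consumers.
    for step in TopologicalSorter(deps).static_order():
        here = location[step]
        for parent in sorted(deps.get(step, ())):
            src = location[parent]
            # Data produced on the same location needs no transfer, and
            # data already shipped to this location is not sent again.
            if src != here and (parent, here) not in transferred:
                plan[src].append(("send", parent, here))
                plan[here].append(("recv", parent, src))
                transferred.add((parent, here))
        plan[here].append(("exec", step))
    return dict(plan)


# Illustrative three-step pipeline spread over two locations.
deps = {"align": {"fetch"}, "plot": {"align"}}
location = {"fetch": "cloud", "align": "hpc", "plot": "hpc"}
plan = compile_plan(deps, location)
for loc, trace in plan.items():
    print(loc, trace)
```

In this sketch, only the `fetch` output crosses locations, so a single `send`/`recv` pair is emitted; the `align` to `plot` dependency stays local to `hpc` and produces no communication, which is the kind of redundant data movement an optimisation pass would aim to avoid.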