Compilers for general-purpose languages have been shown to be at a disadvantage in specialized application domains compared with their Domain-Specific Language (DSL) counterparts. However, the field of DSL compilers shows little consolidation in terms of compiler frameworks and adjacent software ecosystems. As a result, considerable work is duplicated, lost to maintenance issues, or remains undiscovered, and most DSLs are never considered "production-ready". One notable development is the introduction of the Multi-Level Intermediate Representation (MLIR), which promises an impact on DSL compilers similar to the one LLVM had on general-purpose tooling. In this work, we present a NumPy-like DSL for offloading numeric tensor kernels that is entirely MLIR-native. In a first for open-source software, it implements all frontend actions and semantic analyses directly within MLIR. This is made possible by our new dialect-agnostic MLIR type checker, designed to support future DSLs in MLIR. We implement a simple yet effective parallel-first lowering scheme that connects our language to another MLIR dataflow dialect for seamless offloading. We show that our approach performs well in real-world use cases from the domains of weather modeling and Computational Fluid Dynamics (CFD) in Fortran.