Like compilers for other programming models, compilers for SYCL, the open programming model for heterogeneous computing based on C++, would benefit from access to higher-level intermediate representations. Two major challenges for SYCL compilers are the loss of high-level structure and semantics caused by premature lowering to low-level intermediate representations, and the inability to reason about host and device code simultaneously. The MLIR compiler framework, through its dialect mechanism, makes it possible to model domain-specific, high-level intermediate representations and provides the facilities needed to address these challenges. This work therefore describes practical experience with the design and implementation of an MLIR-based SYCL compiler. By modeling key elements of the SYCL programming model, in both host and device code, in the MLIR dialect framework, the presented approach enables powerful device code optimizations as well as analyses that span host and device code. Compared to two LLVM-based SYCL implementations, this yields speedups of up to 4.3x on a collection of SYCL benchmark applications. Finally, this work discusses challenges encountered during design and implementation and how they could be addressed in the future.