Uproot reads ROOT TTrees using pure Python. For numerical and (singly) jagged arrays, this is fast because a whole block of data can be interpreted as an array without modifying the data. For other cases, such as arrays of std::vector<std::vector<float>>, numerical data are interleaved with structure, and the only way to deserialize them is with a sequential algorithm. When written in Python, such algorithms are very slow. We solve this problem by writing the same logic in a language that can be executed quickly. AwkwardForth is a Domain Specific Language (DSL), based on Standard Forth with I/O extensions for making Awkward Arrays, and it can be interpreted as a fast virtual machine without requiring LLVM as a dependency. We generate code as late as possible to take advantage of optimization opportunities. All ROOT types previously implemented with Python have been converted to AwkwardForth. Double and triple-jagged arrays, for example, are 400x faster in AwkwardForth than in Python, with multithreaded scaling up to 1 second/GB because AwkwardForth releases the Python GIL. We also investigate the possibility of JIT-compiling the generated AwkwardForth code using LLVM to increase the performance gains. In this paper, we describe design aspects, performance studies, and future directions in accelerating Uproot with AwkwardForth.
翻译:Uproot使用纯Python读取ROOT TTree。对于数值数组和(单层)锯齿数组,由于整块数据可直接解释为数组而无需修改数据,因此速度较快。对于其他情况,如std::vector<std::vector<float>>类型的数组,数值数据与结构信息交错排列,反序列化的唯一方法是采用顺序算法。当用Python编写此类算法时,速度非常缓慢。我们通过使用快速执行的语言编写相同逻辑来解决此问题。AwkwardForth是一种领域特定语言(DSL),基于标准Forth并扩展了用于生成Awkward Array的I/O功能,它可作为快速虚拟机解释执行,无需依赖LLVM。我们尽可能晚地生成代码以充分利用优化机会。此前用Python实现的所有ROOT类型已转换为AwkwardForth。例如,双层和三层锯齿数组在AwkwardForth中的速度比Python快400倍,且由于AwkwardForth释放了Python GIL,多线程扩展可达1秒/GB。我们还研究了使用LLVM对生成的AwkwardForth代码进行即时编译以进一步提升性能的可能性。本文描述了使用AwkwardForth加速Uproot的设计方案、性能研究及未来方向。