Stream processing engines (SPEs) are widely used for large-scale streaming analytics over unbounded, time-ordered data streams. Modern streaming analytics applications exhibit diverse compute characteristics and impose strict latency and throughput requirements. Over the years, significant effort has gone into building hardware-efficient SPEs that support a range of query optimization, parallelization, and execution strategies to meet the performance requirements of these applications. In this work, however, we observe that these strategies often fail to generalize to many real-world streaming analytics applications because of several inherent design limitations of current SPEs. We further argue that these limitations stem from shortcomings in the fundamental design choices and the query representation model of modern SPEs. To address these challenges, we first propose TiLT, a novel intermediate representation (IR) that offers a highly expressive temporal query language amenable to effective query optimization and parallelization strategies. We then build a compiler backend for TiLT that applies such optimizations to streaming queries and generates hardware-efficient code to achieve high performance for multi-core stream query execution. We demonstrate that TiLT achieves up to 326x (20.49x on average) higher throughput than state-of-the-art SPEs (e.g., Trill) across eight real-world streaming analytics applications. TiLT source code is available at https://github.com/ampersand-projects/tilt.git.