Performance-critical industrial applications, including large-scale program, network, and distributed system analyses, increasingly rely on recursive queries for data analysis. Yet traditional query optimization techniques based on relational algebra do not scale well to recursive query processing, because query evaluation is iterative and relation cardinalities can change unpredictably during a single query execution. To avoid error-prone cardinality estimation, adaptive query processing techniques use runtime information to inform query optimization, but these systems are not optimized for the specific needs of recursive query processing. In this paper, we introduce Adaptive Metaprogramming, a technique that shifts recursive query optimization and code generation from compile-time to runtime using principled metaprogramming, enabling dynamic optimization and re-optimization both before and after query execution has begun. We present a custom join-ordering optimization that can be applied at multiple stages of query compilation and execution. Through Carac, we evaluate the optimization potential of Adaptive Metaprogramming and show that the execution time of unoptimized recursive queries can be improved by three orders of magnitude and that of hand-optimized queries by 4x.
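To make the idea of re-optimizing a recursive query while it runs more concrete, the following is a minimal Python sketch, not Carac's implementation. It evaluates the textbook transitive-closure rule `path(x, z) :- path(x, y), edge(y, z)` with semi-naive iteration and re-chooses the join order at the start of every iteration from the cardinalities observed at that moment, instead of relying on an up-front estimate. All names (`order_by_cardinality`, `join`, `transitive_closure`, `delta_path`) are hypothetical, the smallest-relation-first heuristic is an assumed stand-in for the paper's join-ordering optimization, and the sketch interprets the plan rather than generating code at runtime.

```python
from typing import Dict, List, Optional, Set, Tuple

Row = Tuple[int, ...]
Atom = Tuple[str, Tuple[str, ...]]  # (relation name, variable names)


def extend(binding: Dict[str, int], variables: Tuple[str, ...], row: Row) -> Optional[Dict[str, int]]:
    """Extend a variable binding with one row, or return None on a conflict."""
    new = dict(binding)
    for var, value in zip(variables, row):
        if new.setdefault(var, value) != value:
            return None
    return new


def order_by_cardinality(body: List[Atom], relations: Dict[str, Set[Row]]) -> List[Atom]:
    """Assumed heuristic: join the currently smallest relation first."""
    return sorted(body, key=lambda atom: len(relations[atom[0]]))


def join(body: List[Atom], relations: Dict[str, Set[Row]]) -> List[Dict[str, int]]:
    """Nested-loop join of the body atoms, in the order given."""
    bindings: List[Dict[str, int]] = [{}]
    for name, variables in body:
        next_bindings = []
        for binding in bindings:
            for row in relations[name]:
                extended = extend(binding, variables, row)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings


def transitive_closure(edges: Set[Row]) -> Tuple[Set[Row], List[List[str]]]:
    """Semi-naive evaluation of path(x, z) :- path(x, y), edge(y, z).

    Returns the derived path relation and the join order chosen per iteration.
    """
    body: List[Atom] = [("delta_path", ("x", "y")), ("edge", ("y", "z"))]
    path = set(edges)   # base case: path(x, y) :- edge(x, y)
    delta = set(edges)  # facts that were new in the previous iteration
    plans: List[List[str]] = []
    while delta:
        relations = {"delta_path": delta, "edge": edges}
        # Re-plan using the sizes observed now, not a compile-time estimate.
        plan = order_by_cardinality(body, relations)
        plans.append([name for name, _ in plan])
        derived = {(b["x"], b["z"]) for b in join(plan, relations)}
        delta = derived - path
        path |= delta
    return path, plans
```

A call such as `transitive_closure({(1, 2), (2, 3), (3, 4)})` returns the closure together with the plan picked in each iteration. With only two body atoms and a nested-loop join the order barely affects cost, so the sketch shows only where the re-ordering decision sits in the evaluation loop; the payoff from re-planning is expected with longer rule bodies and index-based joins.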