With the growing demand for large-scale data analysis, many DBMSs have adopted complex query execution mechanisms, including vectorized operators, parallel execution, and dynamic pipeline modification. However, Query Performance Prediction (QPP) methods that target these mechanisms and their interactions are still lacking, as most existing approaches assume traditional tree-shaped query plans and static serial executors. To address this challenge, this paper proposes CONCERTO, a Complex query executiON meChanism-awarE leaRned cosT estimatiOn method. CONCERTO first builds an independent resource cost model for each physical operator. It then constructs a Directed Acyclic Graph (DAG) consisting of a dataflow tree backbone and resource competition relationships among concurrent operators. After calibrating the cost impact of parallel operator execution using Graph Attention Networks (GATs) with additional attention mechanisms, CONCERTO extracts and aggregates cost vector trees through Temporal Convolutional Networks (TCNs) to predict query performance. Experimental results demonstrate that CONCERTO achieves higher prediction accuracy than existing methods.
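The graph construction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The `Operator` class, the `pipeline` field, and the rule that operators compete when their pipelines are declared concurrent are all assumptions made for illustration, since the abstract does not state how concurrency is detected or how competition edges are oriented. Competition relationships are kept here as unordered pairs in a separate edge set.

```python
from dataclasses import dataclass, field
from itertools import combinations


@dataclass
class Operator:
    name: str
    pipeline: int  # hypothetical: id of the pipeline that runs this operator
    children: list = field(default_factory=list)


def build_graph(root, concurrent_pipelines):
    """Return (dataflow_edges, competition_edges) for a plan tree.

    dataflow_edges: (child, parent) pairs forming the tree backbone.
    competition_edges: unordered operator pairs whose pipelines are
    listed as concurrent (an assumed criterion, not from the paper).
    """
    dataflow, ops = [], []
    stack = [root]
    while stack:
        op = stack.pop()
        ops.append(op)
        for child in op.children:
            dataflow.append((child.name, op.name))  # data flows child -> parent
            stack.append(child)

    competition = []
    for a, b in combinations(ops, 2):
        pair = frozenset((a.pipeline, b.pipeline))
        # Only operators in two different, concurrently running pipelines compete.
        if len(pair) == 2 and pair in concurrent_pipelines:
            competition.append(tuple(sorted((a.name, b.name))))
    return sorted(dataflow), sorted(competition)


# Toy plan: two scans in separate pipelines feed a hash join, then an aggregate.
scan_a = Operator("scan_a", pipeline=0)
scan_b = Operator("scan_b", pipeline=1)
join = Operator("hash_join", pipeline=2, children=[scan_a, scan_b])
root = Operator("agg", pipeline=2, children=[join])

dataflow_edges, competition_edges = build_graph(root, {frozenset((0, 1))})
print(dataflow_edges)
print(competition_edges)
```

In this toy plan the two scans run in pipelines 0 and 1, which are marked concurrent, so they are the only pair connected by a competition edge. A learned model such as a GAT would then operate over both edge sets.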