For efficient query processing, DBMS query optimizers have for decades relied on delicate cardinality estimation methods. In this work, we propose an Attention-based LEarned Cardinality Estimator (ALECE for short) for select-project-join (SPJ) queries. The core idea is to discover the implicit relationships between queries and the underlying dynamic data using attention mechanisms. ALECE consists of two modules, both built on top of carefully designed featurizations for data and queries. From all attributes in the database, the data-encoder module obtains organic and learnable aggregations that implicitly represent correlations among the attributes, whereas the query-analyzer module builds a bridge between the query featurizations and the data aggregations to predict the query's cardinality. We experimentally evaluate ALECE on multiple dynamic workloads. The results show that ALECE enables PostgreSQL's optimizer to achieve nearly optimal performance, clearly outperforming its built-in cardinality estimator and other alternatives.
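The two-module design described above can be illustrated with a minimal sketch. It is not ALECE's implementation: the per-attribute histogram featurization, the vector sizes, the query vector, and the `readout` weights are all assumptions made for illustration, and the learned projection matrices a real attention layer would have are omitted. The sketch only shows the data flow: self-attention over attribute featurizations (a stand-in for the data-encoder), then cross-attention from the query featurization to those aggregations (a stand-in for the query-analyzer), then a linear readout.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors.

    No learned projections are applied; a trained model would first map
    queries, keys, and values through learned weight matrices.
    """
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        out.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return out


def estimate_log_cardinality(attr_feats, query_feat, readout):
    """Hypothetical end-to-end pass; returns an (untrained) scalar score."""
    # Data-encoder stand-in: each attribute attends to all attributes,
    # so every aggregation mixes information across attributes.
    data_aggs = attention(attr_feats, attr_feats, attr_feats)
    # Query-analyzer stand-in: the query featurization attends to the
    # data aggregations, linking the query to the current data state.
    mixed = attention([query_feat], data_aggs, data_aggs)[0]
    # Linear readout; in a trained model these weights would be learned.
    return sum(w * x for w, x in zip(readout, mixed))


# Assumed featurization: one normalized 4-bin histogram per attribute.
attr_feats = [
    [0.7, 0.2, 0.1, 0.0],
    [0.25, 0.25, 0.25, 0.25],
    [0.0, 0.1, 0.3, 0.6],
]
# Assumed query featurization: which bins the query's predicate touches.
query_feat = [0.0, 1.0, 1.0, 0.0]
# Arbitrary, untrained readout weights.
readout = [0.5, -0.2, 0.1, 0.3]

score = estimate_log_cardinality(attr_feats, query_feat, readout)
print(score)
```

Because the data side is re-featurized whenever the underlying tables change, a design of this shape can react to dynamic data without changing the query featurization; the printed score is meaningless here since no weights have been trained.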