CERT: Finding Performance Issues in Database Systems Through the Lens of Cardinality Estimation

Database Management Systems (DBMSs) process a given query by creating a query plan, which is subsequently executed, to compute the query's result. Deriving an efficient query plan is challenging, and both academia and industry have invested decades into researching query optimization. Despite this, DBMSs are prone to performance issues, where a DBMS produces an unexpectedly inefficient query plan that might lead to the slow execution of a query. Finding such issues is a longstanding problem and inherently difficult, because no ground truth information on an expected execution time exists. In this work, we propose Cardinality Estimation Restriction Testing (CERT), a novel technique that finds performance issues through the lens of cardinality estimation. Given a query on a database, CERT derives a more restrictive query (e.g., by replacing a LEFT JOIN with an INNER JOIN), whose estimated number of rows should not exceed the estimated number of rows for the original query. CERT tests cardinality estimation specifically, because they were shown to be the most important part for query optimization; thus, we expect that finding and fixing such issues might result in the highest performance gains. In addition, we found that other kinds of query optimization issues can be exposed by unexpected estimated cardinalities, which can also be found by CERT. CERT is a black-box technique that does not require access to the source code; DBMSs expose query plans via the EXPLAIN statement. CERT eschews executing queries, which is costly and prone to performance fluctuations. We evaluated CERT on three widely used and mature DBMSs, MySQL, TiDB, and CockroachDB. CERT found 13 unique issues, of which 2 issues were fixed and 9 confirmed by the developers. We expect that this new angle on finding performance bugs will help DBMS developers in improving DMBSs' performance.

翻译：数据库管理系统（DBMS）通过创建查询计划并执行该计划来处理给定查询，以计算查询结果。设计高效的查询计划极具挑战性，学术界和工业界已为此投入数十年研究查询优化。尽管如此，DBMS仍容易出现性能问题——系统可能生成异常低效的查询计划，导致查询执行缓慢。发现此类问题是长期存在的难题，且本质上十分困难，因为查询的执行时间不存在预期标准答案。本文提出基数估计约束测试（CERT）这一新技术，通过基数估计的视角发现性能问题。对于给定数据库上的查询，CERT会推导出一个更严格的查询（例如将LEFT JOIN替换为INNER JOIN），该查询的预估行数不应超过原始查询的预估行数。CERT专门测试基数估计，因为研究证明这是查询优化中最重要的环节；因此，我们预期发现并修复此类问题可能带来最高的性能提升。此外，我们发现其他类型的查询优化问题也可能通过异常的预估基数暴露，而CERT同样能够发现这些问题。CERT是一种黑盒技术，无需访问源代码即可运行；DBMS通过EXPLAIN语句暴露查询计划。CERT避免实际执行查询，因为执行成本高昂且易受性能波动影响。我们在三个广泛使用且成熟的开源DBMS（MySQL、TiDB和CockroachDB）上评估了CERT。共发现13个独特问题，其中2个已被修复，9个获得开发者确认。我们预期这种发现性能缺陷的新视角将有助于DBMS开发者改善系统性能。