CERT: Finding Performance Issues in Database Systems Through the Lens of Cardinality Estimation

Database Management Systems (DBMSs) process a given query by creating a query plan, which is subsequently executed, to compute the query's result. Deriving an efficient query plan is challenging, and both academia and industry have invested decades into researching query optimization. Despite this, DBMSs are prone to performance issues, where a DBMS produces an unexpectedly inefficient query plan that might lead to the slow execution of a query. Finding such issues is a longstanding problem and inherently difficult, because no ground truth information on an expected execution time exists. In this work, we propose Cardinality Estimation Restriction Testing (CERT), a novel technique that finds performance issues through the lens of cardinality estimation. Given a query on a database, CERT derives a more restrictive query (e.g., by replacing a LEFT JOIN with an INNER JOIN), whose estimated number of rows should not exceed the estimated number of rows for the original query. CERT tests cardinality estimation specifically, because they were shown to be the most important part for query optimization; thus, we expect that finding and fixing such issues might result in the highest performance gains. In addition, we found that other kinds of query optimization issues can be exposed by unexpected estimated cardinalities, which can also be found by CERT. CERT is a black-box technique that does not require access to the source code; DBMSs expose query plans via the EXPLAIN statement. CERT eschews executing queries, which is costly and prone to performance fluctuations. We evaluated CERT on three widely used and mature DBMSs, MySQL, TiDB, and CockroachDB. CERT found 13 unique issues, of which 2 issues were fixed and 9 confirmed by the developers. We expect that this new angle on finding performance bugs will help DBMS developers in improving DMBSs' performance.

翻译：摘要：数据库管理系统（DBMS）通过生成查询计划来处理给定查询，随后执行该计划以计算查询结果。推导高效的查询计划极具挑战性，学术界和工业界已投入数十年时间研究查询优化。尽管如此，DBMS仍易出现性能问题——系统可能生成异常低效的查询计划，导致查询执行缓慢。发现此类问题是一个长期存在的难题，其本质困难在于缺少关于预期执行时间的真实基准信息。本文提出基数估计限制测试（CERT），这是一种通过基数估计视角发现性能问题的新技术。对于数据库上的给定查询，CERT推导出一个更具限制性的查询（例如，将LEFT JOIN替换为INNER JOIN），其估计行数不应超过原始查询的估计行数。CERT专门针对基数估计进行测试，因为已证明它是查询优化中最关键的部分；因此，我们预期发现并修复此类问题可能带来最大的性能提升。此外，我们发现其他类型的查询优化问题也可能通过异常的估计基数暴露出来，而CERT同样能发现这些问题。CERT是一种黑盒技术，无需访问源代码；DBMS通过EXPLAIN语句暴露查询计划。CERT避免执行查询——这既成本高昂又易受性能波动影响。我们在三个广泛使用且成熟的DBMS（MySQL、TiDB和CockroachDB）上评估了CERT。CERT共发现13个独特问题，其中2个已被修复，9个获得开发者确认。我们期望这种发现性能缺陷的新角度能帮助DBMS开发者提升系统性能。