This paper presents predicate transfer, a novel method that improves join performance by pre-filtering tables to reduce join input sizes. Predicate transfer generalizes Bloom join, which pre-filters within a single join operation, to multi-table joins, so that the filtering benefit can increase significantly. It is inspired by the seminal theoretical results of Yannakakis, which use semi-joins to pre-filter the tables of acyclic queries. Predicate transfer extends these results to arbitrary join graphs and replaces semi-joins with Bloom filters, leading to significant speedups. Evaluation shows that predicate transfer outperforms Bloom join by 3.1x on average on the TPC-H benchmark.
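To make the idea concrete, the following is a minimal Python sketch of Bloom-filter pre-filtering along a three-table join chain. It is an illustration under stated assumptions, not the implementation evaluated in the paper: the `BloomFilter` class, the `transfer` helper, the toy tables, and the choice of one forward and one backward pass (mirroring the semi-join passes of Yannakakis) are all hypothetical. Because Bloom filters admit false positives, the filtered tables are supersets of what exact semi-joins would keep, so the joins that follow still return correct results.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter backed by a Python int used as a bit array."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, key):
        # Derive num_hashes bit positions from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key!r}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        # False means definitely absent; True means possibly present.
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


def transfer(source_rows, source_key, target_rows, target_key):
    """Build a filter on the source join key and pre-filter the target."""
    bloom = BloomFilter()
    for row in source_rows:
        bloom.add(row[source_key])
    return [row for row in target_rows if bloom.might_contain(row[target_key])]


# Hypothetical toy tables forming the chain customers - orders - lineitems.
customers = [
    {"c_id": 1, "region": "ASIA"},
    {"c_id": 2, "region": "EUROPE"},
    {"c_id": 3, "region": "ASIA"},
]
orders = [
    {"o_id": 10, "c_id": 1},
    {"o_id": 11, "c_id": 2},
    {"o_id": 12, "c_id": 2},
    {"o_id": 13, "c_id": 4},
]
lineitems = [
    {"o_id": 10, "qty": 5},
    {"o_id": 11, "qty": 7},
    {"o_id": 14, "qty": 1},
]

# Local predicate on a single table; its effect is then transferred.
customers_f = [c for c in customers if c["region"] == "ASIA"]

# Forward pass: customers -> orders -> lineitems.
orders_f = transfer(customers_f, "c_id", orders, "c_id")
lineitems_f = transfer(orders_f, "o_id", lineitems, "o_id")

# Backward pass: lineitems -> orders -> customers.
orders_f = transfer(lineitems_f, "o_id", orders_f, "o_id")
customers_f = transfer(orders_f, "c_id", customers_f, "c_id")

print(len(customers_f), len(orders_f), len(lineitems_f))
```

In this sketch the predicate on `region` shrinks not only `customers` but also `orders` and `lineitems` before any join runs, whereas a Bloom join would apply the filter only across the single join it belongs to.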