Practical MCTS-based Query Optimization: A Reproducibility Study and a New MCTS Algorithm for Complex Queries

Monte Carlo Tree Search (MCTS) has been proposed as a transformative approach to join-order optimization in database query processing, with recent frameworks such as AlphaJoin and HyperQO claiming to outperform traditional methods. However, their reliance on learned cost models raises concerns about generalizability and deployment readiness. This paper presents a comprehensive reproducibility study of these methods, revealing that they often fail to reproduce the claimed performance gains when subjected to diverse workloads. Through an ablation study, we diagnose the root cause of this instability: while the MCTS search strategy is effective, the accompanying learned cost models suffer from severe out-of-distribution generalization errors. To address this, we propose a novel MCTS framework. Unlike prior methods that rely on unstable learned components, our approach uses the database's standard internal cost model, augmented by a new Extreme UCT (Upper Confidence Bound applied to Trees) selection policy to navigate the search space more robustly. We benchmark our method against the original AlphaJoin and HyperQO, as well as industry-standard baselines including Dynamic Programming (DP) and Genetic Query Optimization (GEQO), using the well-known Join Order Benchmark (JOB) and the new JOB-Complex benchmark. The results demonstrate that our approach outperforms learned MCTS methods and outperforms a state-of-the-art query optimizer in complex join scenarios on real-world data. We release the full implementation and experimental artifacts to support further research.
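
For orientation, the selection step that UCT-style policies build on can be sketched as follows. This is the standard UCT rule only, not the Extreme UCT policy, which the abstract names but does not define. The node layout (`visits`, `reward_sum`), the function name `uct_select`, and the idea that rewards are derived from the database's cost estimate and scaled to [0, 1] are all assumptions made for illustration.

```python
import math


def uct_select(children, c=math.sqrt(2)):
    """Standard UCT: pick the child with the best mean reward plus exploration bonus.

    Each child is a dict with hypothetical keys:
      visits     - number of times the child was explored
      reward_sum - accumulated reward (assumed here: higher means a cheaper plan)
    """
    total_visits = sum(ch["visits"] for ch in children)
    best, best_score = None, -math.inf
    for ch in children:
        if ch["visits"] == 0:
            return ch  # try unvisited children before comparing scores
        mean = ch["reward_sum"] / ch["visits"]
        bonus = c * math.sqrt(math.log(total_visits) / ch["visits"])
        if mean + bonus > best_score:
            best, best_score = ch, mean + bonus
    return best
```

In a join-order search, each child would correspond to one candidate table to join next. Setting `c` to 0 makes the choice purely greedy on the mean reward, while larger values favor rarely visited candidates.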
