Recently, using machine learning (ML) based techniques to optimize modern database management systems (DBMSs) has attracted intensive interest from both industry and academia. When tuning a specific component of a DBMS (e.g., index selection or knob tuning), ML-based tuning agents have been shown to find better configurations than experienced database administrators. However, one critical yet challenging question remains unexplored: how to make those ML-based tuning agents work collaboratively. Existing methods do not consider the dependencies among multiple agents, and the model used by each agent only studies the effect of changing the configuration of a single component. To tune different components of a DBMS, a coordinating mechanism is needed to make the agents cognizant of each other. We also need to decide how to allocate the limited tuning budget among the agents to maximize performance. Such a decision is difficult to make because the reward distribution of each agent is unknown and non-stationary. In this paper, we study the above question and present a unified coordinating framework that efficiently utilizes existing ML-based agents. First, we propose a message propagation protocol that specifies the collaboration behaviors of agents and encapsulates the global tuning messages in each agent's model. Second, we combine Thompson Sampling, a well-studied reinforcement learning algorithm, with a memory buffer so that our framework can allocate budget judiciously in a non-stationary environment. Our framework defines interfaces that accommodate a broad class of ML-based tuning agents, yet are simple enough for integration with existing implementations and future extensions. We show that it can effectively utilize different ML-based agents and find better configurations, with 1.4x to 14.1x speedups in workload execution time compared with baselines.
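To make the budget allocation idea concrete, the following is a minimal sketch of Thompson Sampling combined with a memory buffer. It is not the paper's implementation: the abstract does not say how the buffer works, so the sketch assumes a fixed-size sliding window of recent binary rewards (1 if a tuning step improved performance, 0 otherwise) and a Beta(1, 1) prior per agent. The class name `WindowedThompsonSampler`, the agent names, the window size, and the success probabilities in the toy environment are all hypothetical.

```python
import random
from collections import deque


class WindowedThompsonSampler:
    """Beta-Bernoulli Thompson Sampling over a sliding window of recent rewards.

    Assumption: the "memory buffer" is modeled as a bounded deque per agent,
    so outcomes older than `window` steps no longer affect the posterior.
    """

    def __init__(self, agent_names, window=20, seed=None):
        self.rng = random.Random(seed)
        # One bounded buffer per agent; old outcomes fall out automatically.
        self.buffers = {name: deque(maxlen=window) for name in agent_names}

    def select(self):
        """Draw a success rate per agent from its posterior; pick the largest."""
        best_name, best_draw = None, -1.0
        for name, buf in self.buffers.items():
            successes = sum(buf)
            failures = len(buf) - successes
            # Beta(1, 1) prior updated with the buffered outcomes only.
            draw = self.rng.betavariate(1 + successes, 1 + failures)
            if draw > best_draw:
                best_name, best_draw = name, draw
        return best_name

    def update(self, name, improved):
        """Record whether the chosen agent's tuning step improved performance."""
        self.buffers[name].append(1 if improved else 0)


def simulate(rounds=400, switch_at=200, seed=0):
    """Toy non-stationary environment: the more useful agent changes midway.

    Returns how often each agent was chosen during the last 100 rounds.
    """
    env_rng = random.Random(seed)
    sampler = WindowedThompsonSampler(
        ["index_agent", "knob_agent"], window=20, seed=seed + 1
    )
    picks_late = {"index_agent": 0, "knob_agent": 0}
    for t in range(rounds):
        # Hypothetical success probabilities, swapped at `switch_at`.
        if t < switch_at:
            probs = {"index_agent": 0.8, "knob_agent": 0.2}
        else:
            probs = {"index_agent": 0.2, "knob_agent": 0.8}
        chosen = sampler.select()
        sampler.update(chosen, env_rng.random() < probs[chosen])
        if t >= rounds - 100:
            picks_late[chosen] += 1
    return picks_late
```

Calling `simulate()` returns the per-agent selection counts for the final 100 rounds, which gives a rough view of whether the sampler has followed the shift in rewards. The bounded buffer is the design choice that matters here: because stale outcomes are discarded, an agent whose usefulness has changed is re-evaluated on recent evidence rather than on its whole history.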