In hybrid transactional and analytical processing (HTAP) systems, users often struggle to understand why query plans from one engine (OLAP or OLTP) perform significantly slower than those from the other. Although optimizers expose plan details via the EXPLAIN statement, these outputs are frequently too technical for non-experts and offer limited insight into performance differences across engines. To address this, we propose a novel framework that leverages large language models (LLMs) to explain query performance in HTAP systems. Built on Retrieval-Augmented Generation (RAG), our framework constructs a knowledge base that stores historical query executions and expert-curated explanations. To enable efficient retrieval of relevant knowledge, query plans are embedded using a lightweight tree-CNN classifier. This augmentation allows the LLM to generate clear, context-aware explanations of performance differences between engines. Our approach demonstrates the potential of LLMs in hybrid-engine systems, paving the way for further advances in database optimization and user support.