We introduce QueryGym, an interactive environment for building, testing, and evaluating LLM-based query planning agents. Existing frameworks often tie agents to specific query language dialects or obscure their reasoning; QueryGym instead requires agents to construct explicit sequences of relational algebra operations, which makes evaluation engine-agnostic and planning transparent at every step. The environment is implemented as a Gymnasium interface that supplies observations (schema details, intermediate results, and execution feedback) and receives actions that represent either database exploration (e.g., previewing tables, sampling column values, retrieving unique values) or relational algebra operations (e.g., filter, project, join). We detail the motivation for the environment and its design. In the demo, we showcase the utility of the environment by contrasting it with contemporary LLMs that query databases directly. QueryGym serves as a practical testbed for research in error remediation, transparency, and reinforcement learning for query generation. For the associated demo, see https://ibm.biz/QueryGym.
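To make the interaction loop concrete, the following is a minimal sketch of an environment in this style. It is not QueryGym's actual API: the class name `ToyQueryEnv`, the dictionary-based action format, the `scan` action, and the exact-match reward are all assumptions made for illustration. The sketch uses only the standard library and mirrors the `reset()`/`step()` return shapes of Gymnasium rather than importing it.

```python
from typing import Any

Row = dict[str, Any]
Table = list[Row]


class ToyQueryEnv:
    """Hypothetical stand-in for a query-planning environment.

    Mirrors Gymnasium's reset()/step() return shapes without depending on it.
    """

    def __init__(self, tables: dict[str, Table], target: Table):
        self.tables = tables      # in-memory "database"
        self.target = target      # expected final result of the plan
        self.current: Table = []  # intermediate result of the plan so far

    def _obs(self, result: Table, feedback: str) -> dict[str, Any]:
        # Observation: schema details, a result table, execution feedback.
        schema = {name: sorted(rows[0]) for name, rows in self.tables.items()}
        return {"schema": schema, "result": result, "feedback": feedback}

    def reset(self):
        self.current = []
        return self._obs([], "ready"), {}

    def step(self, action: dict[str, Any]):
        try:
            op = action["op"]
            if op == "preview":
                # Exploration action: inspect a table, leave the plan unchanged.
                rows = self.tables[action["table"]][: action.get("n", 3)]
                return self._obs(rows, "ok"), 0.0, False, False, {}
            if op == "scan":  # assumed action: load a base table into the plan
                new = list(self.tables[action["table"]])
            elif op == "filter":  # equality selection only, for brevity
                new = [r for r in self.current
                       if r[action["column"]] == action["value"]]
            elif op == "project":
                new = [{c: r[c] for c in action["columns"]} for r in self.current]
            elif op == "join":  # equi-join of the current result with a table
                right = self.tables[action["table"]]
                new = [{**l, **r} for l in self.current for r in right
                       if l[action["left_on"]] == r[action["right_on"]]]
            else:
                raise ValueError(f"unknown op: {op}")
        except (KeyError, ValueError) as exc:
            # Execution feedback: report the error and keep the previous state.
            return self._obs(self.current, f"error: {exc!r}"), 0.0, False, False, {}
        self.current = new
        done = self.current == self.target
        return self._obs(self.current, "ok"), float(done), done, False, {}


# Illustrative data and plan: names of employees in the Research department.
tables = {
    "employees": [
        {"id": 1, "name": "Ada", "dept_id": 10},
        {"id": 2, "name": "Bob", "dept_id": 20},
        {"id": 3, "name": "Cy", "dept_id": 10},
    ],
    "departments": [
        {"dept_id": 10, "dept_name": "Research"},
        {"dept_id": 20, "dept_name": "Sales"},
    ],
}
target = [{"name": "Ada"}, {"name": "Cy"}]

env = ToyQueryEnv(tables, target)
obs, info = env.reset()
plan = [
    {"op": "scan", "table": "employees"},
    {"op": "join", "table": "departments", "left_on": "dept_id", "right_on": "dept_id"},
    {"op": "filter", "column": "dept_name", "value": "Research"},
    {"op": "project", "columns": ["name"]},
]
for action in plan:
    obs, reward, terminated, truncated, info = env.step(action)
```

Because each step returns the intermediate result and a feedback string, an agent (or a person reading the trace) can see exactly which operation produced which rows, and a failed operation surfaces as an error message rather than silently ending the episode.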