Move\,37 marks one of the major breakthroughs in AI, demonstrating its ability to surpass human expertise and discover novel strategies beyond traditional gameplay in the strategic two-player board game of Go. Natural Language Processing, Computer Vision, and Robotics have undergone similar shifts with the advent of large foundation models in the form of Large Language Models (LLMs), Vision Language Models (VLMs), and Vision Language Action models (VLAs), respectively. In this paper, we investigate the current state of Artificial Intelligence for Database Systems (AI4DB) research and assess how far AI4DB systems are from achieving their own Move\,37 moment. We envision a Generative Database Agent (Gen-DBA, for short) as the pathway to Move\,37 for database systems, one that brings generative reasoning and creativity to database learning tasks. This vision paper explores this direction by presenting a recipe for building Gen-DBA that includes, but is not limited to, a Transformer backbone, a hardware-grounded tokenization mechanism, a two-stage Goal-Directed Next Token Prediction training paradigm, and a generative inference process.