In modern data-streaming systems, a new type of entity that interacts with streaming data has emerged alongside traditional programs: AI agents. Unlike traditional programs, AI agents use LLM reasoning to accomplish high-level tasks over streaming data that are specified in natural language. Unfortunately, current streaming systems cannot fully support agents: they lack the fundamental mechanisms to avoid the performance interference caused by agentic tasks and to safely handle agentic writes. We argue that the shared log, the core abstraction underlying streaming data, must support creating forks of itself, and that such a forkable shared log is a well-suited substrate for agents acting on streaming data. We propose AgileLog, a new shared log abstraction that provides novel forking primitives for agentic use cases. We design Bolt, an implementation of the AgileLog abstraction that uses novel techniques to make forks cheap and to provide logical and performance isolation.