As foundation models (FMs) play an increasingly prominent role in complex software systems, such as FM-powered agentic software (i.e., Agentware), they introduce significant challenges for developers regarding observability. Unlike traditional software, agents operate autonomously, using extensive data and opaque implicit reasoning, making it difficult to observe and understand their behavior during runtime, especially when they take unexpected actions or encounter errors. In this paper, we highlight the limitations of traditional operational observability in the context of FM-powered software, and introduce cognitive observability as a new type of required observability that has emerged for such innovative systems. We then propose a novel framework that provides cognitive observability into the implicit reasoning processes of agents (a.k.a. reasoning observability), and demonstrate the effectiveness of our framework in boosting the debuggability of Agentware and, in turn, the abilities of an Agentware through a case study on AutoCodeRover, a cuttingedge Agentware for autonomous program improvement.
翻译:随着基础模型在复杂软件系统(例如基础模型驱动的智能体软件,即Agentware)中扮演日益重要的角色,它们为开发者带来了显著的可观测性挑战。与传统软件不同,智能体自主运行,利用海量数据和不透明的隐式推理,这使得在运行时观察和理解其行为变得困难,尤其是在智能体采取意外行动或遇到错误时。本文首先指出了传统运行时可观测性在基础模型驱动软件背景下的局限性,进而引入认知可观测性作为此类创新系统所需的一种新型可观测性。随后,我们提出了一种新颖的框架,该框架能够提供对智能体隐式推理过程(即推理可观测性)的认知可观测性。我们通过对AutoCodeRover(一种用于自主程序改进的前沿Agentware)的案例研究,展示了该框架在提升Agentware可调试性以及进而增强Agentware能力方面的有效性。