Agentic systems, in which diverse agents cooperate to tackle challenging problems, are exploding in popularity in the AI community. However, existing agentic frameworks take a relatively narrow view of agents, apply a centralized model, and target conversational, cloud-native applications (e.g., LLM-based AI chatbots). In contrast, scientific applications require that myriad agents be deployed and managed across diverse cyberinfrastructure. Here we introduce Academy, a modular and extensible middleware designed to deploy autonomous agents across the federated research ecosystem, including HPC systems, experimental facilities, and data repositories. To meet the demands of scientific computing, Academy supports asynchronous execution, heterogeneous resources, high-throughput data flows, and dynamic resource availability. It provides abstractions for expressing stateful agents, managing inter-agent coordination, and integrating computation with experimental control. We present microbenchmark results that demonstrate high performance and scalability in HPC environments. To explore the breadth of applications that agentic workflow designs can support, we also present case studies in materials discovery, astronomy, decentralized learning, and information extraction, in which agents are deployed across diverse HPC systems.