Cloud-based Artificial Intelligence (AI) inference is increasingly latency- and context-sensitive, yet today's AI-as-a-Service is typically consumed as an application-chosen endpoint, leaving the network to provide only best-effort transport. This decoupling prevents enforceable tail-latency guarantees, compute-aware admission control, and continuity under mobility. This paper proposes Network-Exposed AI-as-a-Service (NE-AIaaS) built around a new service primitive: the AI Session (AIS), a contractual object that binds model identity, execution placement, transport Quality-of-Service (QoS), and consent/charging scope into a single lifecycle with explicit failure semantics. We introduce the AI Service Profile (ASP), a compact contract that expresses task modality and measurable service objectives (e.g., time-to-first-response/token, p99 latency, success probability) alongside privacy and mobility constraints. On this basis, we specify protocol-grade procedures for (i) DISCOVER (model/site discovery), (ii) AI PAGING (context-aware selection of the execution anchor), (iii) a two-phase PREPARE/COMMIT that atomically co-reserves compute and QoS resources, and (iv) make-before-break MIGRATION for session continuity. The design maps to existing standards: Common API Framework (CAPIF)-style northbound exposure, ETSI Multi-access Edge Computing (MEC) execution substrates, 5G QoS flows for transport enforcement, and Network Data Analytics Function (NWDAF)-style analytics for closed-loop paging/migration triggers.
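The two-phase PREPARE/COMMIT procedure described above can be sketched as a small state machine. This is a minimal illustrative sketch, not the paper's specification: the class and method names (`AISession`, `prepare`, `commit`, `release`) and the boolean reservation outcomes are assumptions introduced here to show the atomic co-reservation and explicit-failure semantics.

```python
from dataclasses import dataclass
from enum import Enum, auto


class AISState(Enum):
    """Assumed lifecycle states for an AI Session (AIS)."""
    IDLE = auto()
    PREPARED = auto()   # compute and QoS tentatively co-reserved
    ACTIVE = auto()     # reservations committed; session serving traffic
    FAILED = auto()     # explicit failure state per the AIS contract


@dataclass
class AISession:
    """Hypothetical AIS object binding model identity, placement, and QoS."""
    model_id: str
    site: str            # execution anchor chosen by AI PAGING
    qos_profile: str     # 5G QoS flow binding for transport enforcement
    state: AISState = AISState.IDLE

    def prepare(self, compute_reserved: bool, qos_reserved: bool) -> bool:
        # Phase 1: co-reserve compute and transport QoS atomically.
        # If either reservation fails, the session moves to FAILED so
        # the peer can release any partial reservation it holds.
        if self.state is not AISState.IDLE:
            return False
        if compute_reserved and qos_reserved:
            self.state = AISState.PREPARED
            return True
        self.state = AISState.FAILED
        return False

    def commit(self) -> bool:
        # Phase 2: commit is only legal from PREPARED; any other state
        # refuses, preserving the explicit failure semantics.
        if self.state is AISState.PREPARED:
            self.state = AISState.ACTIVE
            return True
        return False

    def release(self) -> None:
        # Tear down from any state; reservations are assumed to be
        # returned to the compute/QoS substrates here.
        self.state = AISState.IDLE
```

Under this sketch, a make-before-break MIGRATION would PREPARE a second `AISession` at the target site and COMMIT it before releasing the source session.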