From Prompt to Service: An SLM-Based Agent Orchestration Gateway for AI-Driven Virtual Worlds

As generative AI capabilities expand, AI-driven virtual worlds face a growing architectural challenge. Users interact through in-world interfaces in multimodal ways, yet their requests demand fundamentally different AI backend models and computational resources. Embedding these capabilities directly into virtual world systems reduces extensibility, complicates maintenance, and limits the ability to coordinate services distributed across edge and cloud infrastructure. This paper presents an SLM-based Agent Orchestration Gateway, a lightweight runtime coordination mechanism that decouples a virtual world client from heterogeneous AI backends through intent-driven service routing. An edge-deployed SLM classifies the semantic intent of each user prompt, a configurable service registry validates and resolves the routing decision, and the selected backend is invoked transparently, so new AI capabilities can be introduced in the virtual world without modifying the client application. The gateway is implemented and evaluated within the InterwovenXR virtual museum testbed. The evaluation shows that compact SLMs can serve as reliable intent routers on edge hardware, and that task-specific fine-tuning can transform sub-billion-parameter models into practical, low-latency routers. A layered configuration that pairs a fine-tuned sub-billion-parameter model as the router with a larger SLM for conversational response generation is shown to be deployable on mid-range edge hardware and more efficient than delegating both responsibilities to a single model. The findings show that SLMs can support practical AI service orchestration in virtual worlds. The work contributes an evaluated architecture for scalable, extensible, and edge-supported AI interaction, enabling virtual agents to become access points to distributed generative AI services.
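The routing flow described in the abstract (classify intent, validate against a registry, invoke the selected backend) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the intent labels, the `ServiceRegistry` class, the `keyword_router` stand-in for the edge-deployed SLM, and the fallback policy are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# A backend is modelled as a function from prompt text to response text.
Backend = Callable[[str], str]


@dataclass
class ServiceRegistry:
    """Hypothetical registry mapping intent labels to backend callables."""
    services: Dict[str, Backend]
    fallback: str  # intent used when the router emits an unknown label

    def resolve(self, intent: str) -> Backend:
        # Validate the router's output: normalise the label and fall back
        # to a default service if the label is not registered.
        label = intent.strip().lower()
        if label not in self.services:
            label = self.fallback
        return self.services[label]


def keyword_router(prompt: str) -> str:
    """Stand-in for the SLM intent classifier (assumption: a real gateway
    would call a fine-tuned small language model here instead)."""
    text = prompt.lower()
    if "draw" in text or "image" in text:
        return "image_generation"
    if "translate" in text:
        return "translation"
    return "conversation"


def handle_prompt(prompt: str, router: Callable[[str], str],
                  registry: ServiceRegistry) -> str:
    # 1. classify intent, 2. resolve via registry, 3. invoke the backend.
    intent = router(prompt)
    backend = registry.resolve(intent)
    return backend(prompt)


# Example registry with two placeholder backends (hypothetical).
registry = ServiceRegistry(
    services={
        "conversation": lambda p: f"[chat] {p}",
        "image_generation": lambda p: f"[image] {p}",
    },
    fallback="conversation",
)
```

In this sketch the client only ever calls `handle_prompt`; adding a capability means registering another entry in `services`, which mirrors the abstract's claim that new backends can be introduced without changing the client. A prompt whose label is missing from the registry (here, `translation`) is routed to the fallback service rather than failing.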
