As generative AI capabilities expand, AI-driven virtual worlds face a growing architectural challenge. Users interact multimodally through in-world interfaces, yet their requests demand fundamentally different AI backend models and computational resources. Embedding these capabilities directly into virtual world systems reduces extensibility, complicates maintenance, and limits the ability to coordinate services distributed across edge and cloud infrastructure. This paper presents an SLM-based Agent Orchestration Gateway, a lightweight runtime coordination mechanism that decouples a virtual world client from heterogeneous AI backends through intent-driven service routing. An edge-deployed small language model (SLM) classifies the semantic intent of each user prompt, a configurable service registry validates and resolves the routing decision, and the selected backend is invoked transparently, so that new AI capabilities can be introduced into the virtual world without modifying the client application. The gateway is implemented and evaluated within the InterwovenXR virtual museum testbed. The evaluation shows that compact SLMs can serve as reliable intent routers on edge hardware, and that task-specific fine-tuning can turn sub-billion-parameter models into practical, low-latency routers. A layered configuration that pairs a fine-tuned sub-billion-parameter model as the router with a larger SLM for conversational response generation is shown to be deployable on mid-range edge hardware and more efficient than delegating both responsibilities to a single model. The findings show that SLMs can support practical AI service orchestration in virtual worlds. The work contributes an evaluated architecture for scalable, extensible, and edge-supported AI interaction, enabling virtual agents to become access points to distributed generative AI services.
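The routing flow described in the abstract (classify intent, validate against a registry, invoke the selected backend) can be illustrated with a minimal sketch. Everything named here is a hypothetical stand-in rather than the paper's implementation: the intent labels, the `ServiceRegistry` class, the `route` function, and the keyword-based `classify_intent`, which only takes the place of the edge-deployed SLM so that the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical backend signature: takes the user prompt, returns a response.
Backend = Callable[[str], str]


@dataclass
class ServiceRegistry:
    """Maps intent labels to backends (illustrative, not the paper's API)."""
    services: Dict[str, Backend]
    fallback: str  # label used when the router emits an unregistered intent

    def resolve(self, label: str) -> str:
        # Validate the router output: a model may emit a label with stray
        # whitespace or casing, or one that no registered service handles.
        normalized = label.strip().lower()
        return normalized if normalized in self.services else self.fallback


def classify_intent(prompt: str) -> str:
    """Stand-in for the SLM router; keyword rules are an assumption."""
    text = prompt.lower()
    if "draw" in text or "image" in text:
        return "image_generation"
    if "translate" in text:
        return "translation"
    return "conversation"


def route(prompt: str, registry: ServiceRegistry,
          classifier: Callable[[str], str] = classify_intent) -> str:
    """Classify the prompt, resolve the label, and invoke the backend."""
    label = registry.resolve(classifier(prompt))
    return registry.services[label](prompt)


# Example registry with two placeholder backends.
registry = ServiceRegistry(
    services={
        "conversation": lambda p: f"[chat] {p}",
        "image_generation": lambda p: f"[image] {p}",
    },
    fallback="conversation",
)

print(route("Draw a Roman vase", registry))
print(route("Translate this label", registry))  # unregistered intent, falls back
```

In this sketch, adding a capability means registering one more entry in `services`; the caller of `route` stays unchanged, which mirrors the decoupling the abstract claims for the client application.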