Edge computing enables AI inference closer to data sources, reducing latency and bandwidth costs. However, orchestrating AI services across the cloud-edge continuum remains challenging due to dynamic workloads and infrastructure variability. We present AIF-Router, an Active Inference-based routing framework that autonomously learns to balance latency, throughput, and resource utilization across multi-tier AI services without offline training. AIF-Router performs Bayesian state inference and expected free energy minimization to guide routing decisions, using real-time metrics collected through observability tooling. Despite device instability on edge nodes, AIF-Router exhibits stable online learning behavior and demonstrates the feasibility of applying Active Inference to adaptive AI service orchestration in unreliable edge environments. Our findings highlight both the promise and the practical challenges of deploying self-adaptive decision-making frameworks in real-world edge AI systems.
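The abstract does not specify AIF-Router's generative model, so the following is only a minimal sketch of the two steps it names, Bayesian state inference and expected free energy minimization, for a discrete routing choice. The state, observation, and action names, the likelihood table, and the preference vector are all hypothetical values chosen for illustration; state transitions and online parameter learning are omitted.

```python
import math

# Hypothetical toy model: every name and number below is an illustrative
# assumption, not taken from AIF-Router.
STATES = ("edge_healthy", "edge_degraded")
OBS = ("low_latency", "high_latency")
ACTIONS = ("edge", "cloud")

# LIKELIHOOD[action][state] = P(observation | state, action), ordered as OBS.
LIKELIHOOD = {
    "edge": {"edge_healthy": (0.9, 0.1), "edge_degraded": (0.2, 0.8)},
    "cloud": {"edge_healthy": (0.6, 0.4), "edge_degraded": (0.6, 0.4)},
}

# Preferred distribution over observations (the agent "wants" low latency).
PREFERRED = (0.95, 0.05)


def update_belief(belief, action, obs_index):
    """Bayes rule: posterior(s) is proportional to P(o | s, a) * prior(s)."""
    unnorm = [
        belief[i] * LIKELIHOOD[action][state][obs_index]
        for i, state in enumerate(STATES)
    ]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def expected_free_energy(belief, action):
    """Risk (KL from preferred outcomes) plus ambiguity (expected entropy)."""
    predicted = [
        sum(belief[i] * LIKELIHOOD[action][state][o]
            for i, state in enumerate(STATES))
        for o in range(len(OBS))
    ]
    risk = sum(
        p * math.log(p / c) for p, c in zip(predicted, PREFERRED) if p > 0
    )
    ambiguity = sum(
        belief[i] * -sum(p * math.log(p)
                         for p in LIKELIHOOD[action][state] if p > 0)
        for i, state in enumerate(STATES)
    )
    return risk + ambiguity


def choose_route(belief):
    """Pick the action with the lowest expected free energy."""
    return min(ACTIONS, key=lambda a: expected_free_energy(belief, a))


if __name__ == "__main__":
    belief = [0.5, 0.5]
    # Observe high latency after routing to the edge, then re-plan.
    belief = update_belief(belief, "edge", OBS.index("high_latency"))
    print(belief, choose_route(belief))
```

In this toy model, a high-latency observation on the edge tier shifts belief toward `edge_degraded`, which raises the expected free energy of routing to the edge and makes the cloud tier the preferred action. A practical router would more likely sample actions from a softmax over negative expected free energy than take a hard minimum, so that it keeps probing the edge tier occasionally.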