Edge intelligence enables AI inference at the network edge, co-located with or near the radio access network, rather than in centralized clouds or on mobile devices. It targets low-latency, resource-constrained applications with large data volumes, requiring tight integration of wireless access and on-site computing. Yet system performance and cost-efficiency hinge on joint pre-deployment dimensioning of radio and computational resources, especially under spatial and temporal uncertainty. Prior work largely emphasizes run-time allocation or relies on simplified models that decouple radio and computing, missing end-to-end correlations in large-scale deployments. This paper introduces a unified stochastic framework to dimension multi-cell edge-intelligent systems. We model network topology with Poisson point processes, capturing random user and base-station locations, inter-cell interference, distance-based fractional power control, and peak-power constraints. By combining this with queueing theory and empirical AI inference workload profiling, we derive tractable expressions for end-to-end offloading delay. These enable a non-convex joint optimization that minimizes deployment cost under statistical QoS guarantees, expressed through strict tail-latency and inference-accuracy constraints. We prove the problem decomposes into convex subproblems, yielding a globally optimal solution. Numerical results in noise- and interference-limited regimes identify cost-efficient design regions, as well as configurations that cause under-utilization or user unfairness. Smaller cells reduce transmission delay but raise per-request computing cost due to weaker server multiplexing, whereas larger cells show the opposite trend. Densification reduces computational costs only when frequency reuse scales with base-station density; otherwise, sparser deployments improve fairness and efficiency in interference-limited settings.
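The topology model described above can be illustrated with a short Monte Carlo sketch: base stations and users are drawn from independent homogeneous Poisson point processes, each user associates with its nearest base station, applies distance-based fractional power control capped by a peak-power constraint, and the resulting uplink SINR reflects inter-cell interference. All parameter values below are hypothetical placeholders for illustration, not the paper's configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameters (illustrative only, not from the paper).
AREA = 1.0          # km^2 square observation window
LAMBDA_BS = 25.0    # base-station density (per km^2)
LAMBDA_UE = 100.0   # user density (per km^2)
ALPHA = 3.7         # path-loss exponent
EPS = 0.8           # fractional power-control factor
P_MAX = 1.0         # peak transmit power (W)
P0 = 1e-3           # target received power (W)
NOISE = 1e-9        # noise power (W)

def sample_ppp(lam, rng):
    """Draw a homogeneous Poisson point process on the unit square."""
    n = rng.poisson(lam * AREA)
    return rng.uniform(0.0, 1.0, size=(n, 2))

bs = sample_ppp(LAMBDA_BS, rng)   # base-station locations
ue = sample_ppp(LAMBDA_UE, rng)   # user locations

# Nearest-BS association; each user inverts a fraction EPS of its
# path loss (distance-based fractional power control), subject to
# the peak-power constraint P_MAX.
d = np.linalg.norm(ue[:, None, :] - bs[None, :, :], axis=2)
serving = d.argmin(axis=1)
r = d[np.arange(len(ue)), serving]
tx_power = np.minimum(P_MAX, P0 * r ** (ALPHA * EPS))

# Uplink SINR at each serving BS: signal from the tagged user,
# interference from all other users sharing the same resource.
rx = tx_power[:, None] * d ** (-ALPHA)   # received-power matrix
signal = rx[np.arange(len(ue)), serving]
interference = rx.sum(axis=0)[serving] - signal
sinr = signal / (interference + NOISE)

print(f"users={len(ue)}, median uplink SINR = "
      f"{np.median(10 * np.log10(sinr)):.1f} dB")
```

Averaging such realizations over many PPP draws is the simulation counterpart of the tractable closed-form expressions the framework derives; the analytical route avoids the sampling cost at dimensioning time.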