The large-scale deployment of 5G networks has not delivered the expected return on investment for mobile network operators, raising concerns about the economic viability of future 6G rollouts. At the same time, surging demand for Artificial Intelligence (AI) inference and training workloads is straining global compute capacity. AI-RAN architectures, in which Radio Access Network (RAN) platforms accelerated by Graphics Processing Units (GPUs) share idle capacity with AI workloads during off-peak periods, offer a potential path to improved capital efficiency. However, the economic case for such systems remains unsubstantiated. In this paper, we present a techno-economic analysis of AI-RAN deployments that combines publicly available benchmarks of 5G Layer-1 processing on heterogeneous platforms (from x86 servers with channel-coding accelerators to modern GPUs) with realistic traffic models and AI service demand profiles for Large Language Model (LLM) inference. We construct a joint cost and revenue model that quantifies the surplus compute capacity available in GPU-based RAN deployments and evaluates the returns from leasing it to AI tenants. Our results show that, across a range of scenarios encompassing token depreciation, varying demand dynamics, and diverse GPU serving densities, the additional capital and operational expenditures of GPU-heavy deployments are offset by AI-on-RAN revenue, yielding a return on investment of up to 8x. These findings strengthen the long-term economic case for accelerator-based RAN architectures and future 6G deployments.