With the rapid advancement of artificial intelligence (AI), generative AI (GenAI) has emerged as a transformative tool, enabling customized and personalized AI-generated content (AIGC) services. However, GenAI models with billions of parameters require substantial memory and computational power to deploy and execute, which poses significant challenges for resource-limited edge networks. In this paper, we address the joint model caching and resource allocation problem in GenAI-enabled wireless edge networks. Our objective is to balance AIGC quality against the delay in AIGC service provisioning. To tackle this problem, we employ a deep deterministic policy gradient (DDPG)-based reinforcement learning approach that efficiently determines model caching and resource allocation decisions for AIGC services in response to user mobility and time-varying channel conditions. Numerical results demonstrate that the DDPG-based approach achieves a higher model hit ratio and provides higher-quality, lower-latency AIGC services than the benchmark solutions.
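To make the two quantities named in the abstract concrete, the following is a minimal sketch of a model hit ratio and a quality-versus-delay score. It is an illustration under stated assumptions, not the formulation used in the paper: the `Request` fields, the weights `quality_weight` and `delay_weight`, and the linear form of the score are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Request:
    model_id: str    # GenAI model requested by the user (hypothetical field)
    quality: float   # assumed AIGC quality score in [0, 1]
    delay_s: float   # assumed end-to-end service delay in seconds


def hit_ratio(requests, cached_models):
    """Fraction of requests whose model is already cached at the edge."""
    if not requests:
        return 0.0
    hits = sum(1 for r in requests if r.model_id in cached_models)
    return hits / len(requests)


def tradeoff_score(requests, quality_weight=1.0, delay_weight=0.5):
    """Average of (weighted quality - weighted delay) over all requests.

    The linear form and the default weights are assumptions chosen for
    illustration; a learning agent could use a score like this as a reward.
    """
    if not requests:
        return 0.0
    total = sum(
        quality_weight * r.quality - delay_weight * r.delay_s
        for r in requests
    )
    return total / len(requests)


# Example: four requests, two of the three distinct models cached at the edge.
requests = [
    Request("a", 0.8, 1.0),
    Request("b", 0.6, 2.0),
    Request("a", 0.9, 0.5),
    Request("c", 0.5, 4.0),
]
cached = {"a", "b"}
ratio = hit_ratio(requests, cached)
score = tradeoff_score(requests)
```

In this sketch, raising `delay_weight` penalizes slow responses more heavily, which is one simple way to express the quality-delay trade-off as a single scalar.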