This study presents a framework that uses large language models (LLMs) to automate the design and generation of curricula for reinforcement learning (RL). As mobile networks evolve towards the 6G era, their growing complexity and dynamic nature make them increasingly difficult to manage. Conventional RL approaches often suffer from slow convergence and poor generalization because of conflicting objectives and the large state and action spaces associated with mobile networks. To address these shortcomings, we adopt curriculum learning, a method that systematically exposes the RL agent to progressively more challenging tasks, improving convergence and generalization. However, curriculum design typically requires extensive domain knowledge and manual effort. Our framework mitigates this by using the generative capabilities of LLMs to automate curriculum design, significantly reducing human effort while improving the RL agent's convergence and performance. As a case study, we consider autonomous coordination and user association in mobile networks. We deploy our approach in a simulated mobile network environment and demonstrate faster RL convergence, better generalization to unseen scenarios, and improved overall performance. Our results highlight the potential of combining LLM-based curriculum generation with RL for managing next-generation wireless networks, marking a significant step towards fully autonomous network operations.
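To make the idea of a curriculum concrete, the following is a minimal sketch of a stage-by-stage training loop. It is not the framework described above. The `Stage` fields (`num_cells`, `num_users`, `pass_score`), the `ToyAgent` class, and the promotion rule (advance once the mean score over a recent window reaches a threshold) are all assumptions made for illustration, and `propose_curriculum` returns a fixed, hand-written list where an LLM-generated curriculum would plug in.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One curriculum stage; every field is a hypothetical difficulty knob."""
    name: str
    num_cells: int
    num_users: int
    pass_score: float  # mean recent score required to move to the next stage


def propose_curriculum() -> list[Stage]:
    """Placeholder for the LLM call: returns a fixed, hand-written curriculum."""
    return [
        Stage("easy", num_cells=2, num_users=5, pass_score=0.6),
        Stage("medium", num_cells=4, num_users=20, pass_score=0.6),
        Stage("hard", num_cells=8, num_users=60, pass_score=0.6),
    ]


class ToyAgent:
    """Stand-in for an RL agent: 'skill' grows by a fixed amount per episode."""

    def __init__(self) -> None:
        self.skill = 0.0

    def run_episode(self, stage: Stage) -> float:
        # Toy score in [0, 1): harder stages need more skill for the same score.
        difficulty = stage.num_cells * stage.num_users
        score = self.skill / (self.skill + difficulty)
        self.skill += 1.0  # pretend the agent learned something this episode
        return score


def train_with_curriculum(agent, stages, window=5, max_episodes=2000):
    """Train on each stage in order; advance when the windowed mean passes."""
    history = []
    for stage in stages:
        recent = deque(maxlen=window)
        episodes = 0
        for episodes in range(1, max_episodes + 1):
            recent.append(agent.run_episode(stage))
            if len(recent) == window and sum(recent) / window >= stage.pass_score:
                break
        # If the episode budget runs out, the loop moves on to the next stage anyway.
        history.append((stage.name, episodes))
    return history


if __name__ == "__main__":
    for name, episodes in train_with_curriculum(ToyAgent(), propose_curriculum()):
        print(f"{name}: {episodes} episodes")
```

Because skill acquired on earlier stages carries over, the agent starts each harder stage with a head start, which is the intuition behind the convergence benefit claimed for curriculum learning. In a real setup the toy agent and score would be replaced by an actual RL algorithm and a network simulator.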