Fifth-generation (5G) wireless systems are increasingly adopted in smart manufacturing to support heterogeneous industrial workloads through services such as enhanced Mobile Broadband (eMBB) and Ultra-Reliable Low-Latency Communication (URLLC). However, industrial traffic is inherently process-driven and temporally correlated, so static or reactive schedulers in the Open Radio Access Network (O-RAN) are ill-suited to such non-stationary conditions, leading to sub-optimal resource utilization and violations of latency and reliability guarantees. This paper proposes a temporal-aware deep reinforcement learning (DRL) xApp for proactive Physical Resource Block (PRB) allocation in O-RAN-enabled industrial networks. The proposed framework integrates a long short-term memory (LSTM) encoder within a Double Deep Q-Network (Double DQN) to model sequential dependencies among slice-level Key Performance Indicators (KPIs), enabling predictive and stable decision-making. A continuous-time Markov chain (CTMC) traffic model is incorporated to emulate machine concurrency and process burstiness. Experimental results show that the LSTM-Double DQN improves slice satisfaction and buffer stability under moderate and heavy load, with the longest sequence window providing the strongest gains.
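To make the CTMC traffic idea concrete, the following is a minimal sketch of how machine concurrency could be emulated. The abstract does not specify the chain's state space or rates, so everything here is an assumption for illustration: each machine is treated as an independent two-state (idle/active) source, `on_rate` and `off_rate` are hypothetical transition rates, and the function names are invented for this example. The state tracked is the number of concurrently active machines, simulated with a standard Gillespie-style jump process.

```python
import random


def simulate_machine_ctmc(n_machines, on_rate, off_rate, horizon, seed=0):
    """Simulate the number of active machines as a birth-death CTMC.

    Assumption (not from the paper): each idle machine becomes active at
    rate `on_rate`, and each active machine becomes idle at rate `off_rate`.
    Returns a list of (jump_time, active_count), starting at (0.0, 0).
    """
    rng = random.Random(seed)
    t, active = 0.0, 0
    path = [(t, active)]
    while True:
        up = (n_machines - active) * on_rate   # total rate of idle -> active
        down = active * off_rate               # total rate of active -> idle
        total = up + down
        t += rng.expovariate(total)            # exponential holding time
        if t >= horizon:
            break
        # Pick the next jump direction in proportion to its rate.
        active += 1 if rng.random() < up / total else -1
        path.append((t, active))
    return path


def time_average_active(path, horizon):
    """Time-weighted mean number of active machines over [0, horizon]."""
    total = 0.0
    ends = path[1:] + [(horizon, None)]
    for (t0, k), (t1, _) in zip(path, ends):
        total += k * (t1 - t0)
    return total / horizon


if __name__ == "__main__":
    path = simulate_machine_ctmc(10, on_rate=1.0, off_rate=1.0, horizon=2000.0)
    print(len(path), time_average_active(path, 2000.0))
```

Under these assumptions the long-run mean number of active machines is `n_machines * on_rate / (on_rate + off_rate)`, so the time average gives a quick sanity check on the simulator. Bursts of concurrent activity in such a trace are the kind of temporally correlated load that a sequence-aware scheduler is meant to anticipate.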