This article investigates an adaptive resource allocation scheme for digital twin (DT) synchronization optimization over dynamic wireless networks. In the considered model, a base station (BS) continuously collects the states of factory physical objects from wireless devices to build a real-time virtual DT system for factory event analysis. Because of this continuous data transmission, maintaining DT synchronization requires extensive wireless resources. To address this issue, a subset of devices is selected to transmit their sensing data, and the resource block (RB) allocation is optimized. This problem is formulated as a constrained Markov decision process (CMDP) that minimizes the long-term mismatch between the physical and virtual systems. To solve this CMDP, we first transform it into a dual problem that captures the impact of the RB constraint on the device scheduling strategy. We then propose a continual reinforcement learning (CRL) algorithm to solve the dual problem. The CRL algorithm learns a policy that is stable across historical experiences, enabling quick adaptation to dynamics in the physical states and the network capacity. Simulation results show that the CRL algorithm adapts quickly to changes in network capacity and reduces the normalized root mean square error (NRMSE) between the physical and virtual states by up to 55.2% while using the same number of RBs as traditional methods.
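A CMDP of the shape described above, with a Lagrangian dual that prices the RB constraint, can be sketched as follows. The notation here (mismatch metric $d$, per-device RB usage $b_{i,t}$, budget $B$) is assumed for illustration and is not taken from the paper:

$$
\min_{\pi}\ \lim_{T\to\infty}\frac{1}{T}\,\mathbb{E}_{\pi}\!\left[\sum_{t=1}^{T} d\big(s_t,\hat{s}_t\big)\right]
\quad \text{s.t.}\quad
\lim_{T\to\infty}\frac{1}{T}\,\mathbb{E}_{\pi}\!\left[\sum_{t=1}^{T}\sum_{i} b_{i,t}\right]\le B,
$$

where $s_t$ and $\hat{s}_t$ are the physical and virtual (DT) states. Introducing a multiplier $\lambda\ge 0$ for the RB budget yields the dual problem

$$
\max_{\lambda\ge 0}\ \min_{\pi}\ \lim_{T\to\infty}\frac{1}{T}\,\mathbb{E}_{\pi}\!\left[\sum_{t=1}^{T}\Big(d\big(s_t,\hat{s}_t\big)+\lambda\Big(\sum_i b_{i,t}-B\Big)\Big)\right],
$$

in which $\lambda$ acts as a per-RB price that shapes the scheduling policy, matching the abstract's statement that the dual problem refines how the RB constraint affects device scheduling.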
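The primal-dual structure behind such a CMDP can be illustrated with a toy scheduler: an inner step syncs a device only when its accumulated mismatch exceeds the current RB price, and an outer dual-ascent step adjusts that price toward the RB budget. This is a minimal generic sketch, not the paper's CRL algorithm; the drift model, greedy inner policy, and all parameter values are assumptions.

```python
import random


def run_primal_dual(num_devices=8, budget=3, steps=2000, eta=0.01, seed=0):
    """Toy primal-dual device scheduler (illustrative only, not the paper's CRL).

    Each device's physical state drifts at a fixed rate; syncing a device
    (one RB) resets its physical-virtual mismatch. A Lagrange multiplier
    `lam` prices RB usage so that long-run usage approaches `budget`.
    """
    rng = random.Random(seed)
    drift = [rng.uniform(0.1, 1.0) for _ in range(num_devices)]  # per-step drift
    staleness = [0.0] * num_devices  # mismatch accumulated since last sync
    lam = 0.0                        # dual variable: price of one RB
    used = []
    for _ in range(steps):
        for i in range(num_devices):
            staleness[i] += drift[i]
        # Inner minimization: sync device i iff its mismatch exceeds the RB price.
        chosen = [i for i in range(num_devices) if staleness[i] > lam]
        for i in chosen:
            staleness[i] = 0.0
        used.append(len(chosen))
        # Dual ascent: raise the price when over budget, lower it when under.
        lam = max(0.0, lam + eta * (len(chosen) - budget))
    # Average RB usage over the second half, after the price has settled.
    avg_rb = sum(used[steps // 2:]) / (steps - steps // 2)
    return lam, avg_rb
```

Running `run_primal_dual()` shows the price settling at a positive value and second-half RB usage hovering near the budget, which is the behavior the dual formulation is meant to enforce.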