The recent convergence of AI and 6G technologies has sparked extensive research interest in making communications more reliable and timely. \emph{Age of Information} (AoI), an integrated metric that captures the trade-offs among reliability, latency, and update frequency, has been studied extensively since its conception. This paper contributes new results in this area by employing a Deep Reinforcement Learning (DRL) approach to decide how to allocate power and when to retransmit in a \emph{freshness-sensitive} downlink multi-user Non-Orthogonal Multiple Access (NOMA) network aided by Hybrid Automatic Repeat reQuest with Chase Combining (HARQ-CC). Specifically, the AoI minimization problem is formulated as a Markov Decision Process (MDP). Then, to obtain deterministic, age-optimal, and intelligent power allocation and retransmission decisions, a Double Dueling Deep Q-Network (DQN) is adopted. Furthermore, a more flexible retransmission scheme, referred to as the Retransmit-At-Will scheme, is proposed to further improve the timeliness of the HARQ-aided NOMA network. Simulation results verify the superiority of the proposed intelligent scheme and demonstrate the threshold structure of the retransmission policy. Extensive simulations are also used to examine whether user pairing is necessary.
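As a rough illustration of the Double Dueling DQN named above, the following minimal sketch shows the two standard ingredients: the dueling aggregation of a state value and per-action advantages, and the Double DQN target in which the online network selects the next action while the target network evaluates it. The function names, the discount factor, the three-action example, and the reward shape (negative age, so that maximizing return minimizes AoI) are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def dueling_q(value, advantage):
    """Combine a state value V(s) and advantages A(s, a) into Q(s, a)."""
    # Subtracting the mean advantage keeps V and A identifiable.
    return value + advantage - advantage.mean(axis=-1, keepdims=True)

def double_dqn_target(reward, q_online_next, q_target_next, gamma=0.9, done=False):
    """Double DQN target: the online net picks the action, the target net scores it."""
    best_action = int(np.argmax(q_online_next))
    bootstrap = 0.0 if done else gamma * float(q_target_next[best_action])
    return reward + bootstrap

# Hypothetical numbers for a single transition with three candidate actions
# (e.g. power-allocation / retransmission choices). The reward is assumed to be
# the negative age, which is an illustrative choice, not the paper's definition.
q_online_next = dueling_q(np.array([1.0]), np.array([0.5, -0.5, 0.0]))
q_target_next = dueling_q(np.array([0.8]), np.array([0.1, 0.3, -0.4]))
target = double_dqn_target(reward=-3.0,
                           q_online_next=q_online_next,
                           q_target_next=q_target_next)
```

Decoupling action selection from action evaluation in this way is the usual motivation for Double DQN, since it reduces the overestimation that arises when a single network both picks and scores the maximizing action.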