End-to-end (E2E) learning has recently been introduced to jointly optimize the transmitter and the receiver in wireless communication systems. However, this E2E architecture requires a differentiable channel model, known in advance, to jointly train the deep neural networks (DNNs) at the transmitter and the receiver, and such a model is rarely available in practice. This paper addresses this issue by developing a deep deterministic policy gradient (DDPG)-based framework. In particular, the proposed solution uses the loss value of the receiver DNN as the reward to train the transmitter DNN. Simulation results show that the proposed solution can jointly train the transmitter and the receiver without a prior channel model. In addition, we demonstrate that the proposed DDPG-based solution achieves better detection performance than state-of-the-art solutions.
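To make the reward mechanism concrete, the following is a minimal sketch of the idea, not the authors' implementation. It assumes a toy setting that does not come from the paper: one message bit per transmission, a scalar symbol with a unit amplitude limit, an additive Gaussian channel that the learner can only sample, a logistic-regression receiver, and a per-message quadratic critic. All names (`train`, `black_box_channel`, `actor`, `critic`) and all hyperparameters are hypothetical. The sketch also omits the replay buffer and target networks of full DDPG, so it should be read as a single-step deterministic policy gradient with a learned critic.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def black_box_channel(x, rng, noise_std=0.3):
    # Stand-in for the real channel (assumed additive Gaussian noise).
    # The learner only observes the output and never differentiates through it.
    return x + rng.gauss(0.0, noise_std)


def train(steps=5000, seed=0, explore_std=0.2,
          lr_actor=0.01, lr_critic=0.05, lr_rx=0.05):
    rng = random.Random(seed)
    # Transmitter (actor): one scalar symbol per message bit.
    actor = [0.1, -0.1]
    # Critic: Q_m(x) = c0 + c1*x + c2*x^2, one quadratic per message.
    critic = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    # Receiver: logistic regression on the received sample y.
    rx_w, rx_b = 0.0, 0.0

    for _ in range(steps):
        m = rng.randrange(2)
        # Deterministic action plus exploration noise, clipped to |x| <= 1.
        x = max(-1.0, min(1.0, actor[m] + rng.gauss(0.0, explore_std)))
        y = black_box_channel(x, rng)

        # Receiver forward pass and cross-entropy loss.
        p = min(max(sigmoid(rx_w * y + rx_b), 1e-9), 1.0 - 1e-9)
        loss = -(m * math.log(p) + (1 - m) * math.log(1.0 - p))

        # Receiver update: ordinary supervised gradient step.
        grad = p - m
        rx_w -= lr_rx * grad * y
        rx_b -= lr_rx * grad

        # The receiver loss, negated, is the transmitter's reward.
        reward = -loss

        # Critic update: regress Q_m(x) toward the observed reward.
        feats = [1.0, x, x * x]
        q = sum(c * f for c, f in zip(critic[m], feats))
        err = reward - q
        for i in range(3):
            critic[m][i] += lr_critic * err * feats[i]

        # Actor update: ascend dQ/dx evaluated at the actor's own action.
        a = actor[m]
        dq_da = critic[m][1] + 2.0 * critic[m][2] * a
        actor[m] = max(-1.0, min(1.0, a + lr_actor * dq_da))

    return actor, (rx_w, rx_b)


if __name__ == "__main__":
    symbols, receiver = train()
    print("learned symbols:", symbols)
    print("receiver (w, b):", receiver)
```

The point of the sketch is that the transmitter's gradient comes from the critic's estimate of the reward, so the channel function is only ever called, never differentiated.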