End-to-end (E2E) learning-based approaches have great potential to reshape existing communication systems by replacing the transceivers with deep neural networks. To do so, E2E learning must assume that prior channel information is available, so that a differentiable channel layer can be formulated mathematically for the backpropagation (BP) of error gradients and the transmitter and receiver can be jointly optimized. However, accurate and instantaneous channel state information is difficult to obtain in practical wireless communication scenarios. Moreover, existing E2E learning-based solutions exhibit limited performance when transmitting data with large block lengths. In this article, we address these practical issues with a deep deterministic policy gradient-based E2E communication system. In particular, the proposed solution uses a reward feedback mechanism to train both the transmitter and the receiver, which alleviates the information loss of error gradients during BP. In addition, a convolutional neural network (CNN)-based architecture is developed to mitigate the curse of dimensionality when transmitting messages with large block lengths. Extensive simulations demonstrate that the proposed solution can not only jointly train the transmitter and the receiver without requiring prior channel knowledge but also achieve a significant improvement in block error rate compared to state-of-the-art solutions.
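The reward-feedback idea can be illustrated with a rough sketch of a deterministic policy gradient update. This is not the authors' implementation. It assumes a toy AWGN channel, 4 messages sent over 2 real channel uses, a linear softmax receiver, and a simple per-message linear critic standing in for the deep critic and CNN architecture described above. All names (`net`, `transmit`, `channel`, `receive`, `train_step`) and hyperparameters are hypothetical.

```python
from types import SimpleNamespace

import numpy as np

M, D = 4, 2          # assumed toy sizes: 4 messages, 2 real channel uses
NOISE_STD = 0.1      # assumed AWGN level, never revealed to the learner
EXPLORE_STD = 0.1    # exploration noise added to the transmitter output

rng = np.random.default_rng(0)
net = SimpleNamespace(
    W=rng.normal(scale=0.5, size=(M, D)),  # transmitter (actor) parameters
    V=rng.normal(scale=0.1, size=(D, M)),  # receiver weights
    b=np.zeros(M),                         # receiver bias
    q_base=np.zeros(M),                    # critic: baseline reward per message
    q_grad=np.zeros((M, D)),               # critic: estimated dQ/da per message
)


def transmit(m):
    """Deterministic policy: message index -> amplitude-limited symbols."""
    return np.tanh(net.W[m])


def channel(x):
    """Black box to the learner; no gradient is taken through it."""
    return x + rng.normal(scale=NOISE_STD, size=x.shape)


def receive(y):
    """Softmax probabilities over the M candidate messages."""
    logits = y @ net.V + net.b
    logits = logits - logits.max()  # for numerical stability
    p = np.exp(logits)
    return p / p.sum()


def train_step(lr_rx=0.1, lr_critic=0.1, lr_actor=0.05):
    m = rng.integers(M)
    mu = transmit(m)
    a = mu + rng.normal(scale=EXPLORE_STD, size=D)  # explored action
    y = channel(a)
    p = receive(y)
    reward = np.log(p[m] + 1e-12)  # scalar feedback: negative cross-entropy

    # Receiver: supervised cross-entropy step using only y and the label m.
    err = p.copy()
    err[m] -= 1.0
    net.V -= lr_rx * np.outer(y, err)
    net.b -= lr_rx * err

    # Critic: fit Q(m, a) = q_base[m] + q_grad[m] . (a - mu) to the reward.
    delta = reward - (net.q_base[m] + net.q_grad[m] @ (a - mu))
    net.q_base[m] += lr_critic * delta
    net.q_grad[m] += lr_critic * delta * (a - mu)

    # Actor: ascend the critic's action gradient, chained through tanh.
    net.W[m] += lr_actor * net.q_grad[m] * (1.0 - mu ** 2)
    return reward
```

Calling `train_step()` repeatedly updates all three parts. The transmitter only ever receives the scalar `reward` through the critic, so no channel model or channel gradient is needed, while the receiver is trained directly on the received samples.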