DDPG-Driven Deep-Unfolding with Adaptive Depth for Channel Estimation with Sparse Bayesian Learning

Deep-unfolding neural networks (NNs) have received great attention because they achieve satisfactory performance with relatively low complexity. Typically, these deep-unfolding NNs are restricted to a fixed depth for all inputs. However, the number of layers required for convergence varies from input to input. In this paper, we first develop a framework of deep deterministic policy gradient (DDPG)-driven deep-unfolding with adaptive depth for different inputs, where the trainable parameters of the deep-unfolding NN are learned by DDPG rather than updated directly by the stochastic gradient descent algorithm. In particular, the optimization variables, trainable parameters, and architecture of the deep-unfolding NN are designed as the state, action, and state transition of DDPG, respectively. This framework is then applied to the channel estimation problem in massive multiple-input multiple-output systems. First, we formulate the channel estimation problem with an off-grid basis and develop a sparse Bayesian learning (SBL)-based algorithm to solve it. Second, the SBL-based algorithm is unfolded into a layer-wise structure with a set of introduced trainable parameters. Third, the proposed DDPG-driven deep-unfolding framework is employed to solve this channel estimation problem based on the unfolded structure of the SBL-based algorithm. To realize adaptive depth, we design a halting score, which is a function of the channel reconstruction error, to indicate when to stop. Furthermore, the proposed framework is extended to realize adaptive depth in general deep neural networks (DNNs). Simulation results show that the proposed algorithm outperforms conventional optimization algorithms and fixed-depth DNNs while using far fewer layers.
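The state, action, and state-transition mapping and the halting rule can be illustrated with a minimal sketch. It is not the paper's method. The abstract gives neither the SBL updates nor the exact halting score, so the layer below is a plain ISTA step used as a stand-in, and the score is a normalized residual used as a proxy for the channel reconstruction error. The actor is a fixed placeholder rather than a trained DDPG policy, and every function and parameter name is hypothetical.

```python
import numpy as np

def soft_threshold(v, tau):
    """Element-wise soft-thresholding operator."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def unfolded_layer(x, y, A, step, tau):
    """State transition: one unfolded layer maps state x to the next state.

    Assumption: an ISTA step stands in for the unfolded SBL update.
    The pair (step, tau) plays the role of the action.
    """
    grad = A.T @ (A @ x - y)
    return soft_threshold(x - step * grad, step * tau)

def halting_score(x, y, A):
    """Hypothetical halting score: residual energy normalized by ||y||^2."""
    r = y - A @ x
    return float(r @ r) / float(y @ y)

def fixed_policy(x, y, A):
    """Placeholder for a trained actor: returns the action (step, tau)."""
    lipschitz = np.linalg.norm(A, 2) ** 2  # squared spectral norm of A
    return 1.0 / lipschitz, 0.01

def run_adaptive_depth(y, A, policy, max_layers=50, threshold=1e-3):
    """Apply layers until the halting score drops below the threshold."""
    x = np.zeros(A.shape[1])
    depth = 0
    for depth in range(1, max_layers + 1):
        step, tau = policy(x, y, A)             # action chosen from the state
        x = unfolded_layer(x, y, A, step, tau)  # state transition
        if halting_score(x, y, A) < threshold:  # input-dependent stop
            break
    return x, depth

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((30, 60))
    x_true = np.zeros(60)
    x_true[[3, 17, 42]] = [1.0, -2.0, 1.5]
    y = A @ x_true
    x_hat, depth = run_adaptive_depth(y, A, fixed_policy)
    print("layers used:", depth, "halting score:", halting_score(x_hat, y, A))
```

In this sketch the number of layers depends on the input through the halting test, so inputs that converge quickly use fewer layers than the `max_layers` cap. Replacing `fixed_policy` with a learned actor is where a DDPG agent would enter.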
