Conventional matrix completion methods approximate the missing values by assuming that the matrix is low-rank, which leads to a linear approximation of the missing values. It has been shown that enhanced performance can be attained by using nonlinear estimators such as deep neural networks. Deep fully connected neural networks (FCNNs), one of the most suitable architectures for matrix completion, suffer from over-fitting due to their high capacity, which leads to low generalizability. In this paper, we control over-fitting by regularizing the FCNN model in terms of the $\ell_{1}$ norm of intermediate representations and the nuclear norm of weight matrices. The resulting regularized objective function is nonsmooth and nonconvex, so existing gradient-based methods cannot be applied to our model. We propose a variant of the proximal gradient method and investigate its convergence to a critical point. During the initial epochs of FCNN training, the regularization terms are ignored, and their effect is increased gradually over subsequent epochs. This gradual addition of the nonsmooth regularization terms is the main reason for the better performance of the deep neural network with nonsmooth regularization terms (DNN-NSR) algorithm. Our simulations indicate that the proposed algorithm outperforms existing linear and nonlinear algorithms.
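To make the ingredients named above concrete, the following is a minimal NumPy sketch of the standard proximal operators for the $\ell_{1}$ norm and the nuclear norm, together with one generic proximal gradient step on a weight matrix. It is not the DNN-NSR algorithm itself: the function names, the linear ramp in `reg_weight`, and the parameters `lam_max` and `step` are illustrative assumptions, and the abstract does not specify how the $\ell_{1}$ penalty on intermediate representations is handled or which schedule is used.

```python
import numpy as np


def soft_threshold(x, tau):
    """Proximal operator of tau * ||x||_1: elementwise shrinkage toward zero."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def singular_value_threshold(W, tau):
    """Proximal operator of tau * ||W||_*: soft-threshold the singular values."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    # Scale each column of U by its shrunken singular value, then recombine.
    return (U * np.maximum(s - tau, 0.0)) @ Vt


def reg_weight(epoch, num_epochs, lam_max):
    """Hypothetical linear ramp (an assumption, not the paper's schedule):
    zero regularization at epoch 0, lam_max at the final epoch."""
    return lam_max * epoch / max(num_epochs - 1, 1)


def prox_gradient_step(W, grad, step, lam):
    """One generic proximal gradient step for smooth_loss(W) + lam * ||W||_*.

    `grad` is the gradient of the smooth part of the loss at W.
    """
    return singular_value_threshold(W - step * grad, step * lam)
```

With `lam = reg_weight(epoch, num_epochs, lam_max)`, the update reduces to a plain gradient step at epoch 0 and applies progressively stronger singular-value shrinkage in later epochs, which mirrors the gradual introduction of the regularization terms described above.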