The nonconvex formulation of the matrix completion problem has received significant attention in recent years because its computational complexity is lower than that of the convex formulation. Gradient descent (GD) is a simple yet efficient baseline algorithm for solving nonconvex optimization problems. Combined with random initialization, GD has proved successful on many different problems, both in theory and in practice. However, previous works on matrix completion require either careful initialization or regularizers to prove the convergence of GD. In this work, we study rank-1 symmetric matrix completion and prove that GD converges to the ground truth when a small random initialization is used. We show that, within a logarithmic number of iterations, the trajectory enters the region where local convergence occurs. We provide an upper bound on the initialization size that is sufficient to guarantee convergence, and we show that a larger initialization can be used when more samples are available. We observe that the implicit regularization effect of GD plays a critical role in the analysis: over the entire trajectory, it prevents any entry from becoming much larger than the others.
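The setting described above can be illustrated with a minimal NumPy sketch. It is not the authors' code: the loss normalization f(x) = (1/(4p)) ||P_Ω(xx^T − M)||_F^2, the function name `gd_rank1_completion`, the step size, the initialization scale, and the problem sizes are all assumptions chosen for illustration, and the sampling pattern is assumed to be symmetric with each entry observed with probability p.

```python
import numpy as np

def gd_rank1_completion(M_obs, mask, p, init_scale=1e-3, step=0.2, iters=500, seed=0):
    """Plain gradient descent on f(x) = 1/(4p) * ||P_Omega(x x^T - M)||_F^2.

    Hypothetical helper for illustration; no regularizer, no spectral initialization.
    M_obs is the observed matrix (zeros where unobserved), mask is a symmetric 0/1 matrix.
    """
    rng = np.random.default_rng(seed)
    n = M_obs.shape[0]
    # Small random initialization: Gaussian vector with norm of about init_scale.
    x = init_scale * rng.standard_normal(n) / np.sqrt(n)
    for _ in range(iters):
        # Residual on observed entries only: P_Omega(x x^T - M).
        residual = mask * np.outer(x, x) - M_obs
        # For a symmetric mask, the gradient is (1/p) * P_Omega(x x^T - M) x.
        x = x - (step / p) * (residual @ x)
    return x

# Synthetic rank-1 symmetric instance (sizes are assumptions, not from the paper).
rng = np.random.default_rng(1)
n, p = 200, 0.5
x_star = rng.standard_normal(n)
x_star /= np.linalg.norm(x_star)            # unit-norm ground truth
M = np.outer(x_star, x_star)

upper = np.triu(rng.random((n, n)) < p)     # sample the upper triangle (with diagonal)
mask = (upper | upper.T).astype(float)      # mirror it to get a symmetric pattern
M_obs = mask * M

x_hat = gd_rank1_completion(M_obs, mask, p)
# The factor x is identifiable only up to a global sign.
err = min(np.linalg.norm(x_hat - x_star), np.linalg.norm(x_hat + x_star))
print("recovery error up to sign:", err)
```

In this sketch the iterate starts near the origin, grows along the leading direction of the rescaled observed matrix during the early iterations, and then converges locally; recording `np.max(np.abs(x))` against `np.linalg.norm(x)` inside the loop is one way to look at the entrywise balance that the abstract attributes to implicit regularization.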