We study the problem of estimating a low-rank matrix from noisy measurements, with the specific goal of achieving minimax optimal error. In practice, the problem is commonly solved using non-convex gradient descent, because it scales to large real-world datasets. In theory, non-convex gradient descent can achieve minimax error. In practice, however, it often converges so slowly that it cannot deliver estimates of even modest accuracy within a reasonable time. On the other hand, methods that accelerate non-convex gradient descent through rescaling or preconditioning also greatly amplify the measurement noise, producing estimates that are orders of magnitude less accurate than the minimax optimal error allows. In this paper, we propose a slight modification to the usual non-convex gradient descent method that remedies the slow convergence while provably preserving minimax optimality. Our proposed algorithm has essentially the same per-iteration cost as non-convex gradient descent, but is guaranteed to converge to minimax error at a linear rate that is immune to ill-conditioning. Using our proposed algorithm, we reconstruct a 60-megapixel dataset for a medical imaging application and observe significantly lower reconstruction error than with previous approaches.
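To make the convergence gap described in the abstract concrete, the following is a minimal sketch and not the algorithm proposed in the paper, whose modification the abstract does not specify. It compares plain non-convex gradient descent on the factorized objective 0.25 * ||X X^T - Y||_F^2 with a generic right-preconditioned step that multiplies the gradient by (X^T X)^{-1}, in the style of scaled gradient descent. Everything concrete here is an assumption chosen for illustration: the fully observed symmetric measurement model, the problem size, the condition number of 100, the noise level, the initialization near the ground truth, the step sizes, and the helper names `run` and `rel_err`. The sketch uses an exactly parameterized rank, so it shows only the slow convergence under ill-conditioning, not the noise amplification mentioned above.

```python
import numpy as np

def run(Y, X0, step, iters, precondition):
    """Minimize 0.25 * ||X X^T - Y||_F^2 over X, for symmetric Y."""
    X = X0.copy()
    for _ in range(iters):
        grad = (X @ X.T - Y) @ X  # gradient of the factorized objective
        if precondition:
            # Right-precondition: grad @ inv(X^T X), computed via a linear solve.
            grad = np.linalg.solve(X.T @ X, grad.T).T
        X = X - step * grad
    return X

def rel_err(X, M):
    """Relative Frobenius error of the estimate X X^T against M."""
    return np.linalg.norm(X @ X.T - M) / np.linalg.norm(M)

# Hypothetical ill-conditioned test problem (all values are assumptions).
rng = np.random.default_rng(0)
n, r = 30, 2
U, _ = np.linalg.qr(rng.standard_normal((n, r)))  # orthonormal columns
eigvals = np.array([1.0, 0.01])                   # condition number 100
X_star = U * np.sqrt(eigvals)                     # scales each column of U
M_star = X_star @ X_star.T                        # ground-truth low-rank matrix

G = rng.standard_normal((n, n))
Y = M_star + 1e-6 * (G + G.T) / 2                 # noisy symmetric observation

X0 = X_star + 1e-3 * rng.standard_normal((n, r))  # start near the ground truth

iters = 200
X_gd = run(Y, X0, step=0.25, iters=iters, precondition=False)
X_prec = run(Y, X0, step=0.5, iters=iters, precondition=True)

err_gd = rel_err(X_gd, M_star)
err_prec = rel_err(X_prec, M_star)
print(f"plain gradient descent:  {err_gd:.2e}")
print(f"right-preconditioned:    {err_prec:.2e}")
```

In this setup the plain iteration contracts the error along the weak eigen-direction by a factor of roughly 1 - step * 0.01 per iteration, so its progress is tied to the condition number, while the preconditioned step rescales that direction and its local rate does not depend on the conditioning.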