We study the completion of approximately low-rank matrices with entries missing not at random (MNAR). In the context of typical large-dimensional statistical settings, we establish a framework for the performance analysis of the nuclear norm minimization ($\ell_1^*$) algorithm. Our framework produces \emph{exact} estimates of the worst-case residual root mean squared error (RMSE) and the associated phase transitions (PT), both of which admit remarkably simple characterizations. Our results make it possible to \emph{precisely} quantify the impact of key system parameters, including data heterogeneity, size of the missing block, and deviation from ideal low-rankness, on the accuracy of $\ell_1^*$-based matrix completion. To validate our theoretical worst-case RMSE estimates, we conduct numerical simulations, which show close agreement between the theoretical estimates and their numerical counterparts.