Matrix completion aims to estimate missing entries in a data matrix, using the assumption of a low-complexity structure (e.g., low rank) so that imputation is possible. While many effective estimation algorithms exist in the literature, uncertainty quantification for this problem has proved to be challenging, and existing methods are extremely sensitive to model misspecification. In this work, we propose a distribution-free method for predictive inference in the matrix completion problem. Our method adapts the framework of conformal prediction, which provides confidence intervals with guaranteed distribution-free validity in the setting of regression, to the problem of matrix completion. Our resulting method, conformalized matrix completion (cmc), offers provable predictive coverage regardless of the accuracy of the low-rank model. Empirical results on simulated and real data demonstrate that cmc is robust to model misspecification while matching the performance of existing model-based methods when the model is correct.
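To make the idea of distribution-free intervals around a low-rank estimate concrete, the following is a minimal sketch of ordinary split conformal prediction applied to matrix completion. It is not the authors' cmc algorithm, whose details the abstract does not give. It assumes that entries are observed uniformly at random, so that held-out observed entries and missing entries are exchangeable. The function names `low_rank_fit` and `split_conformal_completion`, the hard-impute style estimator, and all parameter defaults are illustrative assumptions.

```python
import numpy as np


def low_rank_fit(M, mask, rank, n_iter=50):
    """Illustrative estimator (an assumption, not from the source):
    alternate between a rank-`rank` truncated SVD and re-imposing
    the entries that are available for training."""
    X = np.where(mask, M, 0.0)  # unobserved entries start at zero
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        L = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        X = np.where(mask, M, L)  # keep training entries, impute the rest
    return L


def split_conformal_completion(M, observed, rank=2, alpha=0.1,
                               cal_frac=0.3, seed=0):
    """Return a point estimate M_hat and a half-width q such that
    M_hat[i, j] +/- q targets (1 - alpha) coverage on missing entries,
    assuming uniformly random missingness."""
    rng = np.random.default_rng(seed)
    obs_idx = np.argwhere(observed)
    obs_idx = obs_idx[rng.permutation(len(obs_idx))]

    # Hold out a calibration subset of the observed entries.
    n_cal = int(cal_frac * len(obs_idx))
    cal_r, cal_c = obs_idx[:n_cal, 0], obs_idx[:n_cal, 1]
    train_mask = observed.copy()
    train_mask[cal_r, cal_c] = False

    # Fit on training entries only; the fit never sees calibration values.
    M_hat = low_rank_fit(M, train_mask, rank)

    # Conformity scores: absolute residuals on the calibration entries.
    scores = np.sort(np.abs(M[cal_r, cal_c] - M_hat[cal_r, cal_c]))

    # Finite-sample quantile index: ceil((1 - alpha) * (n_cal + 1)).
    k = int(np.ceil((1 - alpha) * (n_cal + 1)))
    q = np.inf if k > n_cal else scores[k - 1]
    return M_hat, q
```

For a missing entry `(i, j)`, the resulting interval is `[M_hat[i, j] - q, M_hat[i, j] + q]`. Because `q` is calibrated on held-out residuals rather than derived from the low-rank model, the coverage target does not depend on how well the rank-`rank` fit matches the data; a poor fit only makes the intervals wider.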