We present a novel method for matrix completion, specifically designed for matrices where one dimension significantly exceeds the other. Our Columns Selected Matrix Completion (CSMC) method combines Column Subset Selection and Low-Rank Matrix Completion to efficiently reconstruct incomplete datasets. In each step, CSMC solves a convex optimization task. We introduce two algorithms that implement CSMC, each tailored to different problem sizes. A formal analysis outlines the necessary assumptions and the probability of a correct solution. To assess the impact of matrix size, rank, and the proportion of missing entries on solution quality and computation time, we conducted experiments on synthetic data. The method was applied to two real-world problems: recommendation systems and image inpainting. Our results show that CSMC delivers solutions comparable to state-of-the-art matrix completion algorithms based on convex optimization, but with significant runtime savings. This makes CSMC especially valuable for systems that require efficient processing of large, incomplete datasets while maintaining the integrity of the derived insights.
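The two-stage idea described in the abstract (select a subset of columns, complete that smaller submatrix, then extend the result to the full matrix) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration rather than the authors' algorithm: the function names `soft_impute` and `csmc_sketch` are hypothetical, columns are sampled uniformly at random, the first stage uses iterative singular-value soft-thresholding as a stand-in for a nuclear-norm-based convex solver, and the second stage fits each column by least squares on its observed entries.

```python
import numpy as np


def soft_impute(M, mask, lam=0.1, n_iters=300):
    """Complete M on the entries where mask is True.

    Stand-in solver (an assumption, not the paper's): repeated SVD
    soft-thresholding, which targets a nuclear-norm-regularized
    least-squares objective (a convex problem).
    """
    X = np.zeros_like(M, dtype=float)
    for _ in range(n_iters):
        # Keep observed entries, fill missing ones with the current estimate.
        filled = np.where(mask, M, X)
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        s = np.maximum(s - lam, 0.0)  # shrink singular values
        X = (U * s) @ Vt
    return X


def csmc_sketch(M, mask, n_cols, rank, lam=0.1, seed=0):
    """Hypothetical two-stage completion of a wide matrix M."""
    rng = np.random.default_rng(seed)
    n_total = M.shape[1]

    # Stage 1: sample columns uniformly and complete the submatrix.
    cols = rng.choice(n_total, size=n_cols, replace=False)
    C = soft_impute(M[:, cols], mask[:, cols], lam=lam)

    # Use the leading left singular vectors as a column-space basis.
    U, _, _ = np.linalg.svd(C, full_matrices=False)
    basis = U[:, :rank]

    # Stage 2: fit every column to the basis using its observed rows only.
    X = np.zeros(M.shape, dtype=float)
    for j in range(n_total):
        obs = mask[:, j]
        if not obs.any():
            continue  # nothing observed in this column; leave zeros
        coef, *_ = np.linalg.lstsq(basis[obs], M[obs, j], rcond=None)
        X[:, j] = basis @ coef
    return X
```

In this sketch the expensive SVD-based step only touches the `n_cols` sampled columns, while the remaining columns are recovered by small independent least-squares problems. That is one plausible way the runtime savings mentioned in the abstract could arise when one dimension is much larger than the other.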