Sparse coding refers to modeling a signal as a sparse linear combination of the elements of a learned dictionary. It has proven to be a successful and interpretable approach in many applications, such as signal processing, computer vision, and medical imaging. While this success has spurred much work on sparse coding with provable guarantees, work on the setting where the learned dictionary is larger than the ground-truth dictionary (or \textit{over-realized}) is comparatively nascent. Existing theoretical results in the over-realized regime are limited to the case of noiseless data. In this paper, we show that for over-realized sparse coding in the presence of noise, minimizing the standard dictionary learning objective can fail to recover the ground-truth dictionary, regardless of the magnitude of the signal in the data-generating process. Furthermore, drawing from the growing body of work on self-supervised learning, we propose a novel masking objective and prove that minimizing this new objective can recover the ground-truth dictionary. We corroborate our theoretical results with experiments across several parameter regimes, showing that our proposed objective achieves better empirical performance than the standard reconstruction objective.