Sparse coding, which models a signal as a sparse linear combination of the elements of a learned dictionary, has proven to be a successful and interpretable approach in applications such as signal processing, computer vision, and medical imaging. This success has spurred much work on provable guarantees for dictionary recovery when the learned dictionary is the same size as the ground-truth dictionary, but work on the over-realized setting, where the learned dictionary is larger than the ground truth, is comparatively nascent. Existing theoretical results in this setting are limited to noiseless data. We show that, in the presence of noise, minimizing the standard dictionary learning objective can fail to recover the elements of the ground-truth dictionary in the over-realized regime, regardless of the magnitude of the signal in the data-generating process. Furthermore, drawing on the growing body of work on self-supervised learning, we propose a novel masking objective for which recovering the ground-truth dictionary is in fact optimal as the signal increases, for a large class of data-generating processes. We corroborate our theoretical results with experiments across several parameter regimes, which show that the proposed objective also performs better empirically than the standard reconstruction objective.
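To make the two kinds of objective concrete, the following is a minimal sketch and not the paper's exact formulation. It assumes a signal built from a few ground-truth atoms with ±1 coefficients plus Gaussian noise, and a greedy one-atom encoder standing in for whatever sparse coding step is actually used. All function names (`sample_signal`, `encode_one_atom`, `reconstruction_loss`, `masked_loss`) are hypothetical. The reconstruction loss fits and scores on the same coordinates; the masked loss fits the code on the visible coordinates only and scores the prediction on the held-out ones.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sample_signal(atoms, k, noise_std, rng):
    """Assumed data model: sum of k random atoms with +/-1 coefficients, plus Gaussian noise."""
    d = len(atoms[0])
    support = rng.sample(range(len(atoms)), k)
    y = [rng.gauss(0.0, noise_std) for _ in range(d)]
    for j in support:
        c = rng.choice([-1.0, 1.0])
        for i in range(d):
            y[i] += c * atoms[j][i]
    return y


def encode_one_atom(y, atoms, coords):
    """Greedy 1-sparse code fitted using only the coordinates in `coords`.

    Returns (atom index, coefficient). This is a stand-in encoder chosen
    for brevity, not the encoder analysed in the paper.
    """
    best_score, best_j, best_c = 0.0, 0, 0.0
    y_c = [y[i] for i in coords]
    for j, atom in enumerate(atoms):
        a_c = [atom[i] for i in coords]
        norm2 = dot(a_c, a_c)
        if norm2 == 0.0:
            continue  # atom is invisible on these coordinates
        corr = dot(a_c, y_c)
        score = corr * corr / norm2  # squared error removed by this atom
        if score > best_score:
            best_score, best_j, best_c = score, j, corr / norm2
    return best_j, best_c


def reconstruction_loss(y, atoms):
    """Standard objective: encode and score on all coordinates."""
    coords = list(range(len(y)))
    j, c = encode_one_atom(y, atoms, coords)
    return sum((y[i] - c * atoms[j][i]) ** 2 for i in coords)


def masked_loss(y, atoms, held_out):
    """Masking objective (sketch): encode from visible coordinates, score on held-out ones."""
    visible = [i for i in range(len(y)) if i not in held_out]
    j, c = encode_one_atom(y, atoms, visible)
    return sum((y[i] - c * atoms[j][i]) ** 2 for i in held_out)
```

Averaging either loss over many draws from `sample_signal`, for a candidate dictionary with more atoms than the ground truth, gives a rough empirical analogue of the two objectives compared above. Because the held-out coordinates are never seen by the encoder, an atom that merely fits noise on the visible part gains nothing on the masked loss, which is the intuition this sketch is meant to illustrate.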