We consider the problem of sparse nonnegative matrix factorization (NMF) using archetypal regularization. The goal is to represent a collection of data points as nonnegative linear combinations of a few nonnegative sparse factors with appealing geometric properties that arise from the use of archetypal regularization. We generalize the notion of robustness studied in Javadi and Montanari (2019) (without sparsity) to the notions of (a) strong robustness, which implies that each estimated archetype is close to the underlying archetypes, and (b) weak robustness, which implies that at least one recovered archetype is close to the underlying archetypes. Our theoretical results on robustness guarantees hold under minimal assumptions on the underlying data and apply to settings where the underlying archetypes need not be sparse. We present theoretical results and illustrative examples to strengthen the insights underlying the notions of robustness. We propose new algorithms for our optimization problem and present numerical experiments on synthetic and real data sets that shed further light on our proposed framework and theoretical developments.