When applying nonnegative matrix factorization (NMF), the rank parameter is generally unknown. This rank, called the nonnegative rank, is usually estimated heuristically because computing its exact value is NP-hard. In this work, we propose an approximation method that estimates the rank on-the-fly while solving NMF. We use the sum-of-norm (SON), a group-lasso structure that encourages pairwise similarity, to reduce the rank of a factor matrix whose rank is overestimated at the beginning. On various datasets, SON-NMF is able to reveal the correct nonnegative rank of the data without any prior knowledge or tuning. SON-NMF is a nonconvex, nonsmooth, non-separable, and non-proximable problem, so solving it is nontrivial. First, as rank estimation in NMF is NP-hard, the proposed approach does not enjoy a lower computational complexity. Using a graph-theoretic argument, we prove that the complexity of SON-NMF is almost irreducible. Second, the per-iteration cost of any algorithm solving SON-NMF can be high, which motivated us to propose a first-order block coordinate descent (BCD) algorithm that approximately solves SON-NMF at a low per-iteration cost by using the proximal average operator. Lastly, we propose a simple greedy method for post-processing. SON-NMF exhibits favourable features for applications. Besides automatically estimating the rank from data, SON-NMF can deal with rank-deficient data matrices and can detect weak components with small energy. Furthermore, in hyperspectral imaging applications, SON-NMF handles the issue of spectral variability naturally.
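To illustrate why a sum-of-norm term can reduce an overestimated rank, the following minimal sketch evaluates a SON penalty on the columns of a small factor matrix. It assumes the penalty is the sum of Euclidean norms of all pairwise column differences, which the abstract does not spell out. The names `son_penalty` and `count_distinct_columns`, the tolerance `tol`, and the toy matrices are hypothetical and only serve the illustration. The column-counting helper is not the greedy post-processing method mentioned above.

```python
import numpy as np

def son_penalty(W):
    """Assumed SON form: sum over column pairs i < j of ||W[:, i] - W[:, j]||_2."""
    r = W.shape[1]
    total = 0.0
    for i in range(r):
        for j in range(i + 1, r):
            total += np.linalg.norm(W[:, i] - W[:, j])
    return total

def count_distinct_columns(W, tol=1e-6):
    """Hypothetical helper: count columns that differ from all kept ones by more than tol."""
    reps = []
    for j in range(W.shape[1]):
        col = W[:, j]
        if all(np.linalg.norm(col - rep) > tol for rep in reps):
            reps.append(col)
    return len(reps)

# Toy factor with an overestimated rank of 3: two columns have merged.
W_merged = np.array([[1.0, 1.0, 0.0],
                     [0.0, 0.0, 2.0]])

# Same size, but all three columns are distinct.
W_spread = np.array([[1.0, 0.5, 0.0],
                     [0.0, 0.5, 2.0]])

print(son_penalty(W_merged), count_distinct_columns(W_merged))
print(son_penalty(W_spread), count_distinct_columns(W_spread))
```

In this toy case the merged factor has the smaller penalty, and counting its distinct columns gives 2 rather than the initial 3. This is the sense in which encouraging pairwise similarity can lower the effective rank.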