In data-driven control and machine learning, a common requirement is to decompose large matrices into smaller, low-rank factors with prescribed levels of sparsity. This paper introduces a new solution to the orthogonal nonnegative matrix factorization (ONMF) problem, whose objective is to approximate the input data by two low-rank nonnegative matrices subject to both orthogonality and $\ell_0$-norm sparsity constraints. The proposed maximum-entropy-principle-based framework ensures orthogonality and sparsity of either the features or the mixing matrix, while maintaining nonnegativity in both. Additionally, the methodology offers a quantitative determination of the ``true'' number of underlying features, a crucial hyperparameter for ONMF. Experimental evaluation on synthetic and standard datasets shows that the method outperforms existing approaches in sparsity, orthogonality, and computational speed. Notably, the proposed method achieves reconstruction errors comparable to or better than those reported in the literature.
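To make the ONMF constraints concrete, the following is a minimal NumPy sketch, not the maximum-entropy method proposed in the paper. It assumes the data matrix is written as X ≈ WH with W (features) and H (mixing matrix) nonnegative, and it places the orthogonality constraint on the rows of H; the symbols X, W, H, k and the helper `onmf_metrics` are illustrative names that do not come from the source. The sketch builds a synthetic factorization and checks the three quantities the abstract refers to: reconstruction error, orthogonality, and sparsity.

```python
import numpy as np

def onmf_metrics(X, W, H):
    """Return (relative reconstruction error,
               orthogonality residual ||H H^T - I||_F,
               maximum number of nonzeros in any column of H)."""
    rel_err = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
    k = H.shape[0]
    orth_res = np.linalg.norm(H @ H.T - np.eye(k))
    max_nnz = int(np.count_nonzero(H, axis=0).max())
    return rel_err, orth_res, max_nnz

# Synthetic example (all sizes are arbitrary, chosen for illustration only).
rng = np.random.default_rng(0)
d, n, k = 6, 12, 3

# Assign each data column to exactly one feature, so the rows of H
# have disjoint supports and are therefore mutually orthogonal.
labels = np.arange(n) % k
H = np.zeros((k, n))
H[labels, np.arange(n)] = rng.uniform(0.5, 1.5, size=n)
H /= np.linalg.norm(H, axis=1, keepdims=True)  # unit-norm rows, so H H^T = I

W = rng.uniform(0.0, 1.0, size=(d, k))  # nonnegative feature matrix
X = W @ H                               # data that factorizes exactly

rel_err, orth_res, max_nnz = onmf_metrics(X, W, H)
print(rel_err, orth_res, max_nnz)
```

Under these assumptions, nonnegativity together with row orthogonality of H forces every column of H to have at most one nonzero entry, which is why orthogonality and $\ell_0$ sparsity appear together in the problem statement.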