Nonnegative Matrix Factorization (NMF) is an important unsupervised learning method for extracting meaningful features from data. To make the NMF problem solvable in polynomial time, researchers introduced a separability assumption, which has recently been extended to the concept of coseparability. Coseparability offers a more efficient core representation of the original data. However, real-world data such as images and videos are more naturally represented as multi-dimensional arrays. Applying NMF to such data requires vectorization, which risks losing essential multi-dimensional correlations. To retain these inherent correlations, we turn to tensors (multi-dimensional arrays) and leverage the tensor t-product. This approach extends coseparable NMF to the tensor setting, yielding what we term coseparable Nonnegative Tensor Factorization (NTF). In this work, we propose an alternating index selection method to select the coseparable core. Furthermore, we validate the t-CUR sampling theory and integrate it with the tensor Discrete Empirical Interpolation Method (t-DEIM) to introduce an alternative, randomized index selection process. We test these methods on both synthetic and facial analysis datasets. The results demonstrate the efficiency of coseparable NTF compared with coseparable NMF.
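To make the tensor t-product mentioned above concrete, the following is a minimal sketch for third-order tensors, not the authors' implementation. It assumes the standard definition of the t-product as circular convolution of frontal slices along the third mode, which can be computed slice by slice in the Fourier domain. The function names `t_product` and `t_product_direct` are hypothetical and chosen only for illustration.

```python
import numpy as np

def t_product(A, B):
    """t-product of A (n1 x n2 x n3) with B (n2 x n4 x n3) via the FFT.

    Illustrative sketch: transform along the third mode, multiply matching
    frontal slices, then transform back.
    """
    if A.shape[1] != B.shape[0] or A.shape[2] != B.shape[2]:
        raise ValueError("incompatible shapes for the t-product")
    A_hat = np.fft.fft(A, axis=2)
    B_hat = np.fft.fft(B, axis=2)
    # Frontal-slice-wise matrix products in the Fourier domain.
    C_hat = np.einsum("ijk,jlk->ilk", A_hat, B_hat)
    # For real inputs the result is real up to rounding error.
    return np.real(np.fft.ifft(C_hat, axis=2))

def t_product_direct(A, B):
    """Same product computed directly as a circular convolution of slices."""
    n1, _, n3 = A.shape
    n4 = B.shape[1]
    C = np.zeros((n1, n4, n3))
    for k in range(n3):
        for j in range(n3):
            C[:, :, k] += A[:, :, j] @ B[:, :, (k - j) % n3]
    return C

# Small usage example with random nonnegative tensors (synthetic data).
rng = np.random.default_rng(0)
A = rng.random((3, 4, 5))
B = rng.random((4, 2, 5))
C = t_product(A, B)
```

Because the direct form only adds products of entries, the t-product of two nonnegative tensors is itself nonnegative, which is the property a nonnegative tensor factorization under the t-product relies on.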