Nonnegative Matrix Factorization (NMF) is an important unsupervised learning method for extracting meaningful features from data. To make the NMF problem solvable in polynomial time, researchers introduced a separability assumption, which has recently evolved into the concept of coseparability. Coseparability offers a more efficient core representation of the original data. In practice, however, data such as images or videos are more naturally represented as multidimensional arrays. Applying NMF to such data requires vectorization, which risks losing essential multidimensional correlations. To retain these inherent correlations, we turn to tensors (multidimensional arrays) and leverage the tensor t-product. This approach extends coseparable NMF to the tensor setting, yielding what we term coseparable Nonnegative Tensor Factorization (NTF). In this work, we provide an alternating index selection method to select the coseparable core. Furthermore, we validate the t-CUR sampling theory and integrate it with the tensor Discrete Empirical Interpolation Method (t-DEIM) to introduce an alternative, randomized index selection process. We test these methods on both synthetic and facial analysis datasets. The results demonstrate the efficiency of coseparable NTF compared with coseparable NMF.
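For readers unfamiliar with the tensor t-product mentioned above, the following is a minimal sketch, assuming the standard definition for third-order tensors: the frontal slices are combined by circular convolution along the third mode, which can be computed slice by slice after an FFT along that mode. The function names `t_product` and `t_product_direct` are hypothetical and chosen for illustration; this is not the authors' implementation, and it does not cover the coseparable index selection itself.

```python
import numpy as np


def t_product(A, B):
    """t-product of A (n1 x n2 x n3) and B (n2 x n4 x n3) via FFT along mode 3.

    Illustrative sketch only; assumes real-valued third-order tensors.
    """
    n1, n2, n3 = A.shape
    m2, n4, m3 = B.shape
    if n2 != m2 or n3 != m3:
        raise ValueError("inner dimension and number of frontal slices must match")
    # Transform both tensors along the third mode.
    A_hat = np.fft.fft(A, axis=2)
    B_hat = np.fft.fft(B, axis=2)
    # Multiply matching frontal slices in the Fourier domain.
    C_hat = np.einsum("ijk,jlk->ilk", A_hat, B_hat)
    # Transform back; the result is real for real inputs up to rounding error.
    return np.fft.ifft(C_hat, axis=2).real


def t_product_direct(A, B):
    """Same product computed directly as a circular convolution of frontal slices."""
    n1, _, n3 = A.shape
    n4 = B.shape[1]
    C = np.zeros((n1, n4, n3))
    for k in range(n3):
        for j in range(n3):
            C[:, :, k] += A[:, :, j] @ B[:, :, (k - j) % n3]
    return C


# Small usage example on random nonnegative tensors (sizes are arbitrary).
rng = np.random.default_rng(0)
A = rng.random((4, 3, 5))
B = rng.random((3, 2, 5))
C = t_product(A, B)  # shape (4, 2, 5)
```

Because the direct form is a sum of products of nonnegative matrices, the t-product of two nonnegative tensors is itself nonnegative (up to floating-point rounding in the FFT version), which is the property a nonnegative tensor factorization built on this product relies on.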