Non-negative Matrix Factorization (NMF) approximates an observed matrix by the product of a basis matrix and a coefficient matrix. If each individual's coefficient vector is explained by that individual's covariates, the coefficient matrix can be written as the product of a parameter matrix and a covariate matrix, which places the model in the framework of Non-negative Matrix tri-Factorization (tri-NMF) with covariates. The resulting structure coincides with the mean structure of the Growth Curve Model (GCM). The difference is that the basis matrix in GCM is specified by the analyst, whereas in NMF with covariates it is unknown and optimized. In this study, we applied NMF with covariates to longitudinal data and compared it with GCM. We have also published an R package that implements this method, and we show how to use it through example analyses of longitudinal measurements, spatiotemporal data, and text data. In particular, we demonstrate the usefulness of Gaussian kernel functions as covariates.
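The factorization described above can be illustrated with a minimal NumPy sketch. It is not the published R package and does not reproduce its interface. The symbols `X` (observed matrix), `B` (basis), `Theta` (parameters), and `A` (covariates), the squared-error objective, the multiplicative update rules, and the helper names `gaussian_kernel` and `nmf_with_covariates` are all assumptions made for illustration; the toy data are synthetic.

```python
import numpy as np

def gaussian_kernel(U, V, beta=1.0):
    """Return A with A[r, n] = exp(-beta * ||U[:, r] - V[:, n]||^2).

    Columns of U and V are points; the result is non-negative by construction.
    """
    d2 = ((U[:, :, None] - V[:, None, :]) ** 2).sum(axis=0)
    return np.exp(-beta * d2)

def nmf_with_covariates(X, A, rank, n_iter=200, eps=1e-10, seed=0):
    """Approximate X (P x N) by B @ Theta @ A with B, Theta >= 0 and A known.

    Uses standard multiplicative updates for the squared Frobenius error
    (an assumed choice of objective and optimizer).
    """
    rng = np.random.default_rng(seed)
    P, _ = X.shape
    R = A.shape[0]
    B = rng.random((P, rank)) + eps
    Theta = rng.random((rank, R)) + eps
    errors = []
    for _ in range(n_iter):
        C = Theta @ A                      # coefficient matrix (rank x N)
        B *= (X @ C.T) / (B @ (C @ C.T) + eps)
        Theta *= (B.T @ X @ A.T) / (B.T @ B @ Theta @ (A @ A.T) + eps)
        errors.append(np.linalg.norm(X - B @ Theta @ A) ** 2)
    return B, Theta, errors

# Synthetic example: 30 time points, 5 kernel centres used as covariates.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 30)[None, :]        # 1 x N covariate (e.g. time)
centres = np.linspace(0.0, 1.0, 5)[None, :]   # 1 x R kernel centres
A = gaussian_kernel(centres, t, beta=20.0)    # R x N covariate matrix
X = rng.random((8, 2)) @ rng.random((2, 5)) @ A   # exact rank-2 structure
B, Theta, errors = nmf_with_covariates(X, A, rank=2)
```

In this sketch only `B` and `Theta` are updated, while `A` stays fixed. Holding `B` fixed as well and fitting `Theta` alone would correspond to the GCM setting, in which the analyst supplies the basis.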