We introduce Neural NMF, a new method based on nonnegative matrix factorization (NMF), for detecting latent hierarchical structure in data. Datasets with hierarchical structure arise in a wide variety of fields, such as document classification, image processing, and bioinformatics. Neural NMF recursively applies NMF in layers to discover overarching topics that encompass the lower-level features. We derive a backpropagation optimization scheme that allows us to frame hierarchical NMF as a neural network. We test Neural NMF on a synthetic hierarchical dataset, the 20 Newsgroups dataset, and the MyLymeData symptoms dataset. Numerical results demonstrate that Neural NMF outperforms other hierarchical NMF methods on these datasets, learning better hierarchical structure and more interpretable topics.
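
To make the layered construction concrete, the following is a minimal sketch of plain sequential hierarchical NMF, the baseline idea of applying NMF recursively to the previous layer's coefficient matrix. It is not the backpropagation scheme of Neural NMF, whose details are not given in this abstract. The function names `nmf` and `hierarchical_nmf`, the factor names `A` and `S`, the layer ranks, and the use of Lee-Seung multiplicative updates are all assumptions chosen for illustration.

```python
import numpy as np


def nmf(X, k, n_iter=200, eps=1e-10, seed=0):
    """Approximate a nonnegative matrix X (m x n) as A @ S.

    A is (m x k) and S is (k x n). Uses Lee-Seung multiplicative
    updates for the Frobenius-norm objective (an illustrative choice).
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    A = rng.random((m, k)) + eps
    S = rng.random((k, n)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep A and S nonnegative.
        S *= (A.T @ X) / (A.T @ A @ S + eps)
        A *= (X @ S.T) / (A @ S @ S.T + eps)
    return A, S


def hierarchical_nmf(X, ranks, **kwargs):
    """Factor X layer by layer: X ~ A0 @ S0, then S0 ~ A1 @ S1, and so on.

    `ranks` lists the number of topics per layer (hypothetical values).
    Returns the list of A factors and the final coefficient matrix S,
    so that X ~ A0 @ A1 @ ... @ S.
    """
    factors = []
    S = X
    for k in ranks:
        A, S = nmf(S, k, **kwargs)
        factors.append(A)
    return factors, S


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.random((20, 30))  # synthetic nonnegative data
    factors, S = hierarchical_nmf(X, ranks=[6, 3])
    approx = factors[0] @ factors[1] @ S
    print("relative error:", np.linalg.norm(X - approx) / np.linalg.norm(X))
```

In this sketch each layer is fit once and then frozen, so errors made at an early layer propagate to later ones. The abstract's backpropagation scheme frames the layers as a neural network rather than fitting them independently in sequence.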