Non-negative matrix factorization (NMF) is a dimensionality reduction technique that has shown promise for analyzing noisy data, especially astronomical data. In such datasets, noise can push observed values negative even when the true underlying physical signal is strictly positive. Prior NMF work has not treated negative data in a statistically consistent manner, which becomes problematic for low signal-to-noise data with many negative values. In this paper we present two algorithms, Shift-NMF and Nearly-NMF, that handle both the noise in the input data and the negative values that noise introduces. Both algorithms use the negative data directly, without clipping, and correctly recover non-negative signals while avoiding the positive offset that clipping negative data introduces. We demonstrate this numerically on both simple and more realistic examples, and prove that the update rules of both algorithms monotonically decrease the objective function.
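The positive offset caused by clipping can be seen in a toy setting. The sketch below is not Shift-NMF or Nearly-NMF. It is only a minimal illustration, assuming a single known non-negative template, Gaussian noise, and an ordinary least-squares amplitude estimate. The function name `amplitude_bias_demo` and all parameter values are hypothetical choices made for this example.

```python
import numpy as np


def amplitude_bias_demo(true_amp=0.5, sigma=1.0, n_trials=2000, n_pix=200, seed=0):
    """Compare least-squares amplitude estimates from raw and clipped noisy data.

    Toy model (an assumption of this sketch): data = true_amp * template + noise,
    with a fixed non-negative Gaussian-shaped template and zero-mean Gaussian noise.
    """
    rng = np.random.default_rng(seed)

    # Non-negative template, standing in for a single known NMF component.
    grid = np.linspace(-5.0, 5.0, n_pix)
    template = np.exp(-0.5 * grid**2)

    # Low signal-to-noise data: many samples fall below zero.
    noise = rng.normal(0.0, sigma, size=(n_trials, n_pix))
    data = true_amp * template + noise

    # The common workaround: set every negative value to zero.
    clipped = np.clip(data, 0.0, None)

    # Least-squares amplitude per trial: (x . t) / (t . t).
    norm = template @ template
    raw_est = (data @ template) / norm
    clipped_est = (clipped @ template) / norm

    return raw_est.mean(), clipped_est.mean()


raw_mean, clipped_mean = amplitude_bias_demo()
print(f"true amplitude:        0.5")
print(f"mean estimate, raw:     {raw_mean:.3f}")
print(f"mean estimate, clipped: {clipped_mean:.3f}")
```

Averaged over many trials, the estimate from the raw data stays close to the true amplitude, while the estimate from the clipped data is pushed upward. Clipping discards the negative half of the noise distribution, so the noise no longer averages to zero.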