Non-negative matrix factorization (NMF) is a dimensionality reduction technique that has shown promise for analyzing noisy data, especially astronomical data. In such datasets, the observed data may contain negative values due to noise even when the true underlying physical signal is strictly positive. Prior NMF work has not treated negative data in a statistically consistent manner, which becomes problematic for low signal-to-noise data with many negative values. In this paper we present two algorithms, Shift-NMF and Nearly-NMF, that handle both the noise in the input data and the negative values that noise introduces. Both algorithms use the negative data without clipping and correctly recover non-negative signals, avoiding the positive offset that clipping negative data introduces. We demonstrate this numerically on both simple and more realistic examples, and prove that both algorithms have monotonically decreasing update rules.
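The positive offset caused by clipping can be illustrated with a small numerical sketch. This is not an implementation of Shift-NMF or Nearly-NMF; it only shows, under assumed values, how clipping zero-mean Gaussian noise around a small positive signal biases the sample mean upward. The names `true_signal`, `noise_sigma`, and `n_samples`, and the values assigned to them, are hypothetical choices made for illustration.

```python
import numpy as np

# Fixed seed so the illustration is reproducible.
rng = np.random.default_rng(0)

# Hypothetical values: a small, strictly positive signal observed
# at low signal-to-noise ratio.
true_signal = 0.5
noise_sigma = 2.0
n_samples = 100_000

# Noisy observations: many are negative even though the signal is positive.
observed = true_signal + rng.normal(0.0, noise_sigma, size=n_samples)

# Clipping sets every negative observation to zero.
clipped = np.clip(observed, 0.0, None)

frac_negative = float((observed < 0).mean())
mean_raw = float(observed.mean())
mean_clipped = float(clipped.mean())

print(f"fraction of negative observations: {frac_negative:.3f}")
print(f"mean without clipping:             {mean_raw:.3f}")
print(f"mean after clipping:               {mean_clipped:.3f}")
```

With these assumed values, the unclipped mean stays close to the true signal, while the clipped mean is roughly twice as large, because clipping discards the negative half of the noise distribution but keeps the positive half.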