Non-negative matrix factorization (NMF) is a dimensionality reduction technique that has shown promise for analyzing noisy data, especially astronomical data. In such datasets, noise can drive observed values negative even when the true underlying physical signal is strictly positive. Prior NMF work has not treated negative data in a statistically consistent manner, which becomes problematic for low signal-to-noise data with many negative values. In this paper we present two algorithms, Shift-NMF and Nearly-NMF, that handle both the noise in the input data and the negative values that noise introduces. Both algorithms use the negative data without clipping, and correctly recover non-negative signals without the positive offset that clipping negative data introduces. We demonstrate this numerically on both simple and more realistic examples, and prove that both algorithms have monotonically decreasing update rules.
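The positive offset attributed to clipping can be illustrated with a small numerical sketch. This is not Shift-NMF or Nearly-NMF, and it is not taken from the paper: it only simulates a constant positive signal with additive Gaussian noise and compares the mean of the raw data with the mean after negative values are clipped to zero. The function name `clipped_mean_bias` and all parameter values are assumptions chosen for illustration.

```python
import random
import statistics


def clipped_mean_bias(true_signal=0.1, noise_sigma=1.0, n_samples=100_000, seed=0):
    """Return (raw mean, clipped mean) for a constant signal plus Gaussian noise.

    All defaults are illustrative assumptions for a low signal-to-noise case.
    """
    rng = random.Random(seed)
    # Noisy observations: many are negative although the true signal is positive.
    raw = [true_signal + rng.gauss(0.0, noise_sigma) for _ in range(n_samples)]
    # Clipping: replace every negative observation with zero.
    clipped = [max(x, 0.0) for x in raw]
    return statistics.fmean(raw), statistics.fmean(clipped)


raw_mean, clipped_mean = clipped_mean_bias()
print(f"true signal  = 0.1")
print(f"raw mean     = {raw_mean:.3f}")
print(f"clipped mean = {clipped_mean:.3f}")
```

With these settings the raw mean stays close to the true value of 0.1, while the clipped mean lands near 0.45, the analytic expectation of max(x, 0) for a Gaussian with mean 0.1 and unit standard deviation. Averaging the unclipped data keeps the estimate unbiased, which is the motivation for using the negative values rather than discarding them.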