Robust Principal Component Analysis (RPCA) is a widely used method for recovering low-rank structure from data matrices corrupted by sparse but significant outliers. These corruptions may arise from occlusions, malicious tampering, or other sources of anomalies, and jointly identifying such corruptions together with the low-rank background is critical for process monitoring and diagnosis. However, existing RPCA methods and their extensions largely do not account for the underlying probability distribution of the data matrices, which in many applications is known and can be highly non-Gaussian. We thus propose a new method called Robust Principal Component Analysis for Exponential Family distributions ($e^{\text{RPCA}}$), which performs the desired decomposition into low-rank and sparse matrices when this distribution falls within the exponential family. We present a novel alternating direction method of multipliers (ADMM) optimization algorithm for efficient $e^{\text{RPCA}}$ decomposition. The effectiveness of $e^{\text{RPCA}}$ is then demonstrated in two applications: steel sheet defect detection, and crime activity monitoring in the Atlanta metropolitan area.
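For orientation, the low-rank plus sparse decomposition that classical RPCA targets can be sketched with a standard ADMM scheme for principal component pursuit. This is the conventional formulation with no distributional model, not the $e^{\text{RPCA}}$ algorithm proposed here. The function names (`soft_threshold`, `svd_threshold`, `rpca_pcp`) and the default choices of `lam` and `mu` are illustrative assumptions based on common practice, not details taken from this work.

```python
import numpy as np

def soft_threshold(X, tau):
    """Elementwise shrinkage: proximal operator of tau * (l1 norm)."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svd_threshold(X, tau):
    """Singular value shrinkage: proximal operator of tau * (nuclear norm)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * soft_threshold(s, tau)) @ Vt

def rpca_pcp(M, lam=None, mu=None, tol=1e-7, max_iter=1000):
    """Classical RPCA baseline (assumed sketch, not e^RPCA).

    Approximately solves  min ||L||_* + lam * ||S||_1  s.t.  L + S = M
    by ADMM with a fixed penalty parameter mu.
    """
    M = np.asarray(M, dtype=float)
    m, n = M.shape
    # Common default choices; both are assumptions, not tuned values.
    lam = 1.0 / np.sqrt(max(m, n)) if lam is None else lam
    mu = m * n / (4.0 * np.abs(M).sum()) if mu is None else mu

    L = np.zeros_like(M)
    S = np.zeros_like(M)
    Y = np.zeros_like(M)  # dual variable for the constraint L + S = M
    norm_M = np.linalg.norm(M)

    for _ in range(max_iter):
        L = svd_threshold(M - S + Y / mu, 1.0 / mu)   # low-rank update
        S = soft_threshold(M - L + Y / mu, lam / mu)  # sparse update
        R = M - L - S                                 # primal residual
        Y = Y + mu * R                                # dual ascent step
        if np.linalg.norm(R) <= tol * norm_M:
            break
    return L, S
```

Calling `L, S = rpca_pcp(M)` on a matrix `M` returns a low-rank estimate `L` and a sparse estimate `S` with `L + S` approximately equal to `M`. The baseline fits the data through an exact equality constraint and does not use a distribution-specific likelihood, which is the gap the abstract describes for non-Gaussian data such as counts.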