Multilinear Principal Component Analysis (MPCA) is an important tool for analyzing tensor data. It performs dimension reduction similar to PCA for multivariate data. However, standard MPCA is sensitive to outliers. It is highly influenced by observations deviating from the bulk of the data, called casewise outliers, as well as by individual outlying cells in the tensors, so-called cellwise outliers. This latter type of outlier is highly likely to occur in tensor data, as tensors typically consist of many cells. This paper introduces a novel robust MPCA method that can handle both types of outliers simultaneously, and can cope with missing values as well. This method uses a single loss function to reduce the influence of both casewise and cellwise outliers. The solution that minimizes this loss function is computed using an iteratively reweighted least squares algorithm with a robust initialization. Graphical diagnostic tools are also proposed to identify the different types of outliers that have been found by the new robust MPCA method. The performance of the method and associated graphical displays is assessed through simulations and illustrated on two real datasets.
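To make the reweighting idea concrete, the following minimal sketch applies iteratively reweighted least squares to a much simpler problem than MPCA: estimating a robust center of tensor-valued samples. It is not the method of this paper. The nested Huber loss, the tuning constants `c_cell` and `c_case`, and the names `robust_tensor_center` and `make_demo_data` are assumptions chosen for illustration. The sketch only shows how a single loss can yield a cellwise weight and a casewise weight at each iteration, starting from a robust initialization (cellwise median and MAD) and giving zero weight to missing cells. It assumes every cell is observed in at least one sample.

```python
import numpy as np


def huber_rho(r, c):
    """Huber loss scaled to equal r**2 on [-c, c] and grow linearly outside."""
    a = np.abs(r)
    return np.where(a <= c, a ** 2, 2.0 * c * a - c ** 2)


def huber_weight(r, c):
    """IRLS weight of the Huber loss: 1 on [-c, c], c/|r| outside."""
    a = np.abs(r)
    return np.where(a <= c, 1.0, c / np.maximum(a, 1e-12))


def robust_tensor_center(X, c_cell=1.5, c_case=1.5, n_iter=100, tol=1e-10):
    """Illustrative IRLS for a robust center of samples X[i] (NaN = missing).

    Assumed loss: sum_i rho_case(t_i / sigma), where t_i**2 is the mean of
    rho_cell(standardized residual) over the observed cells of sample i.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    cell_axes = tuple(range(1, X.ndim))
    observed = ~np.isnan(X)
    Xf = np.where(observed, X, 0.0)
    m = np.maximum(observed.sum(axis=cell_axes), 1)  # observed cells per case

    # Robust initialization: cellwise median and MAD, ignoring missing cells.
    center = np.nanmedian(X, axis=0)
    scale = 1.4826 * np.nanmedian(np.abs(X - center), axis=0)
    scale = np.where(scale > 0, scale, 1.0)

    def residuals_and_case_stat(ctr):
        r = np.where(observed, (Xf - ctr) / scale, 0.0)
        t = np.sqrt(huber_rho(r, c_cell).sum(axis=cell_axes) / m)
        return r, t

    # Casewise scale, fixed once so that the loss stays the same throughout.
    sigma = max(np.median(residuals_and_case_stat(center)[1]), 1e-12)

    losses = []
    for _ in range(n_iter):
        r, t = residuals_and_case_stat(center)
        losses.append(float(huber_rho(t / sigma, c_case).sum()))
        w_cell = huber_weight(r, c_cell) * observed      # 0 for missing cells
        w_case = huber_weight(t / sigma, c_case)
        w = w_cell * (w_case / m).reshape((n,) + (1,) * (X.ndim - 1))
        new_center = (w * Xf).sum(axis=0) / w.sum(axis=0)  # weighted LS step
        converged = np.max(np.abs(new_center - center)) < tol
        center = new_center
        if converged:
            break
    return center, w_cell, w_case, losses


def make_demo_data(n=500, shape=(4, 5), seed=0):
    """Hypothetical toy data: Gaussian noise plus both outlier types and NaNs."""
    rng = np.random.default_rng(seed)
    true_center = np.arange(np.prod(shape), dtype=float).reshape(shape)
    X = true_center + rng.normal(size=(n,) + shape)
    X[:25] += 10.0                           # casewise outliers
    X[rng.random(X.shape) < 0.05] += 20.0    # cellwise outliers
    X[rng.random(X.shape) < 0.05] = np.nan   # missing cells
    return X, true_center


if __name__ == "__main__":
    X, mu = make_demo_data()
    center, w_cell, w_case, losses = robust_tensor_center(X)
    print("max error, robust center:", np.max(np.abs(center - mu)))
    print("max error, plain mean:   ", np.max(np.abs(np.nanmean(X, axis=0) - mu)))
```

Because both Huber terms are concave functions of the squared residuals, each weighted least squares step in this sketch cannot increase the assumed loss. Small values of the returned `w_cell` and `w_case` point to suspicious cells and cases respectively, which is the kind of information a diagnostic display could use.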