Matrix-variate distributions are a recent addition to the model-based clustering field, making it possible to analyze data in matrix form with complex structure, such as images and time series. Because these distributions were introduced to clustering only recently, the literature on matrix-variate data is limited, and even less has been written on handling outliers in these models. This paper discusses an approach for clustering matrix-variate normal data with outliers. The approach, which uses the distribution of subset log-likelihoods, extends the OCLUST algorithm to matrix-variate normal data and iteratively detects and trims outliers.
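To make the idea of subset log-likelihoods concrete, the following is a minimal sketch and not the paper's implementation. It assumes a single cluster whose parameters (mean matrix `M`, row covariance `U`, column covariance `V`) are known and held fixed; a full procedure would presumably re-estimate the mixture for each subset and compare the subset log-likelihoods against a reference distribution before trimming. The function names `matrix_normal_logpdf` and `subset_loglikelihoods` are hypothetical and chosen for illustration only.

```python
import numpy as np


def matrix_normal_logpdf(X, M, U, V):
    """Log-density of an n x p matrix X under a matrix-variate normal
    with mean M (n x p), row covariance U (n x n), column covariance V (p x p)."""
    n, p = X.shape
    R = X - M
    # Quadratic form tr(V^{-1} R^T U^{-1} R), computed with linear solves
    # instead of explicit inverses.
    quad = np.trace(np.linalg.solve(V, R.T) @ np.linalg.solve(U, R))
    _, logdet_U = np.linalg.slogdet(U)
    _, logdet_V = np.linalg.slogdet(V)
    return -0.5 * (n * p * np.log(2.0 * np.pi) + p * logdet_U + n * logdet_V + quad)


def subset_loglikelihoods(Xs, M, U, V):
    """For each observation j, return the log-likelihood of the sample with
    observation j removed. ASSUMPTION: parameters are fixed, not refitted."""
    per_obs = np.array([matrix_normal_logpdf(X, M, U, V) for X in Xs])
    return per_obs.sum() - per_obs


# Illustrative data: 20 observations, each a 3 x 2 matrix, one shifted far away.
rng = np.random.default_rng(0)
Xs = rng.standard_normal((20, 3, 2))
Xs[7] += 10.0  # planted outlier

M = np.zeros((3, 2))
U = np.eye(3)
V = np.eye(2)

subset_ll = subset_loglikelihoods(Xs, M, U, V)
# The observation whose removal raises the log-likelihood the most is the
# candidate outlier in this simplified setting.
candidate = int(np.argmax(subset_ll))
```

In this simplified setting, removing the planted observation yields the largest subset log-likelihood, so it is flagged as the candidate. An iterative scheme would trim that candidate, recompute, and repeat until a stopping criterion is met; the criterion itself is not specified here.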