Matrix-variate distributions are a recent addition to model-based clustering, making it possible to analyze data in matrix form with complex structure, such as images and time series. Because the area is new, the literature on clustering matrix-variate data is limited, and work on handling outliers in these models is scarcer still. An approach for clustering matrix-variate normal data with outliers is discussed. The approach extends the OCLUST algorithm to matrix-variate normal data: it uses the distribution of subset log-likelihoods to detect and trim outliers iteratively.
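To make the idea of subset log-likelihoods concrete, the following is a minimal sketch for a single matrix-variate normal component, not the authors' implementation. The function names, the fixed number of flip-flop iterations, and the rule of flagging the observation whose removal gives the largest subset log-likelihood are all assumptions made for illustration; the full method also involves mixture components and a distributional stopping criterion that are not shown here.

```python
import numpy as np


def matrix_normal_logpdf(X, M, U, V):
    """Log-density of an n x p matrix X under a matrix-variate normal
    with mean M, row covariance U (n x n) and column covariance V (p x p)."""
    n, p = X.shape
    R = X - M
    _, logdet_U = np.linalg.slogdet(U)
    _, logdet_V = np.linalg.slogdet(V)
    # Quadratic form tr(V^{-1} R^T U^{-1} R), computed with solves
    # rather than explicit inverses.
    quad = np.trace(np.linalg.solve(V, R.T) @ np.linalg.solve(U, R))
    return -0.5 * (n * p * np.log(2 * np.pi) + p * logdet_U + n * logdet_V + quad)


def fit_matrix_normal(Xs, n_iter=20):
    """Approximate maximum-likelihood fit by alternating ("flip-flop")
    updates of U and V. Xs has shape (N, n, p). The iteration count is
    an arbitrary choice for this sketch."""
    N, n, p = Xs.shape
    M = Xs.mean(axis=0)
    R = Xs - M
    U, V = np.eye(n), np.eye(p)
    for _ in range(n_iter):
        V_inv = np.linalg.inv(V)
        U = sum(r @ V_inv @ r.T for r in R) / (N * p)
        U_inv = np.linalg.inv(U)
        V = sum(r.T @ U_inv @ r for r in R) / (N * n)
    return M, U, V


def subset_loglikelihoods(Xs):
    """For each i, refit the model without observation i and return the
    log-likelihood of the remaining N - 1 observations."""
    N = Xs.shape[0]
    out = np.empty(N)
    for i in range(N):
        subset = np.delete(Xs, i, axis=0)
        M, U, V = fit_matrix_normal(subset)
        out[i] = sum(matrix_normal_logpdf(x, M, U, V) for x in subset)
    return out


def candidate_outlier(Xs):
    """Index of the observation whose removal gives the largest
    subset log-likelihood (assumed trimming rule for this sketch)."""
    return int(np.argmax(subset_loglikelihoods(Xs)))
```

In an iterative scheme, the flagged observation would be trimmed and the procedure repeated on the reduced data until the subset log-likelihoods look consistent with their reference distribution.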