Invariant Coordinate Selection (ICS) is a multivariate technique based on the simultaneous diagonalization of two scatter matrices. Among other uses, it serves as a dimension reduction tool prior to clustering or outlier detection. Its theoretical foundation explains why and when the identified subspace should contain relevant information by establishing a connection with the Fisher discriminant subspace (FDS). So far, these general results have been examined in detail mainly for specific scatter combinations in a two-cluster framework. In this study, we extend these investigations to more clusters and more scatter combinations. Our analysis shows that it is important to distinguish whether or not the group centers matrix has full rank. In the full-rank case, we establish deeper connections between ICS and FDS. We provide a detailed study of these relationships for three clusters, both when the group centers matrix has full rank and when it does not. Based on these expanded theoretical insights, and supported by numerical studies, we conclude that ICS is suitable for recovering the FDS under very general settings and that cases of failure appear to be rare.
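
To make the phrase "simultaneous diagonalization of two scatter matrices" concrete, the following is a minimal sketch in Python with NumPy. It is an illustration under stated assumptions, not the procedure used in the study: the scatter pair (the sample covariance together with a fourth-moment scatter often called COV4), the normalization constant of COV4, and the function names `cov4` and `ics` are all choices made here for the example. The sketch looks for a matrix B such that B' S1 B is the identity and B' S2 B is diagonal.

```python
import numpy as np


def cov4(X):
    """Fourth-moment scatter (assumed scaling 1 / (n * (p + 2))).

    Each centered observation is weighted by its squared Mahalanobis
    distance computed with the sample covariance.
    """
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    S1 = np.cov(X, rowvar=False)
    # Squared Mahalanobis distance of every row of Xc.
    d2 = np.einsum("ij,jk,ik->i", Xc, np.linalg.inv(S1), Xc)
    return (Xc * d2[:, None]).T @ Xc / (n * (p + 2))


def ics(X):
    """Simultaneously diagonalize S1 = COV and S2 = COV4 (illustrative pair).

    Returns the generalized eigenvalues in decreasing order, the matrix B
    whose columns are the invariant directions, and the centered data
    projected onto those directions.
    """
    Xc = X - X.mean(axis=0)
    S1 = np.cov(X, rowvar=False)
    S2 = cov4(X)

    # Symmetric inverse square root of S1 (assumes S1 is positive definite).
    w, U = np.linalg.eigh(S1)
    S1_inv_sqrt = U @ np.diag(w ** -0.5) @ U.T

    # Whiten S2 with respect to S1, then take an ordinary eigendecomposition.
    M = S1_inv_sqrt @ S2 @ S1_inv_sqrt
    M = (M + M.T) / 2  # remove rounding asymmetry
    lam, Q = np.linalg.eigh(M)

    # Sort eigenvalues (and matching eigenvectors) in decreasing order.
    order = np.argsort(lam)[::-1]
    lam, Q = lam[order], Q[:, order]

    B = S1_inv_sqrt @ Q  # B.T @ S1 @ B = I and B.T @ S2 @ B = diag(lam)
    return lam, B, Xc @ B
```

Calling `lam, B, Z = ics(X)` on an `n x p` data array returns the invariant coordinates `Z`. Which components (those with the largest or the smallest eigenvalues) carry cluster structure depends on the scatter pair and on the mixture, so the sketch deliberately leaves component selection out.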