Linear discriminant analysis (LDA) is a standard method for classification problems with high dimension and small sample size. LDA variants differ in how they estimate the covariance matrix and the mean vectors. In this paper, we consider shrinkage methods based on a non-parametric approach. For the precision matrix, we examine methods based on sparsity structure or data splitting. For the mean vectors, we adopt non-parametric empirical Bayes (NPEB) and non-parametric maximum likelihood estimation (NPMLE) methods, also known as f-modeling and g-modeling, respectively. We analyze the performance of linear discriminant rules built from combined estimation strategies for the covariance matrix and mean vectors. In particular, we present a theoretical result on the performance of the NPEB method and compare it with previous studies. Simulation studies with various covariance matrix and mean vector structures are conducted to evaluate the methods discussed in this paper. Real data examples, including gene expression and EEG data, are also presented.
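To make the plug-in structure of a linear discriminant rule concrete, the following is a minimal sketch, not the estimators studied in this paper. It assumes two classes with equal priors and a common covariance matrix, and it swaps in two simple placeholders: sample means for the mean vectors and a ridge-regularized inverse of the pooled covariance for the precision matrix. The function names `fit_plugin_lda` and `predict` and the parameter `ridge` are hypothetical. The NPEB/NPMLE mean estimators and the sparse or data-splitting precision estimators would replace the marked lines.

```python
import numpy as np

def fit_plugin_lda(X1, X2, ridge=1.0):
    """Plug-in two-class LDA with placeholder estimators (illustrative only)."""
    # Placeholder mean estimates: plain sample means.
    # (A shrinkage estimator such as NPEB or NPMLE would go here instead.)
    mu1 = X1.mean(axis=0)
    mu2 = X2.mean(axis=0)

    # Pooled sample covariance from the class-centered data.
    n1, n2 = X1.shape[0], X2.shape[0]
    R = np.vstack([X1 - mu1, X2 - mu2])
    S = R.T @ R / (n1 + n2 - 2)

    # Placeholder precision estimate: ridge-regularized inverse, which
    # exists even when the dimension exceeds the sample size.
    # (A sparse or data-splitting estimator would go here instead.)
    p = S.shape[0]
    omega = np.linalg.inv(S + ridge * np.eye(p))

    # Linear rule: assign class 1 when w @ x + b > 0.
    w = omega @ (mu1 - mu2)
    b = -0.5 * w @ (mu1 + mu2)
    return w, b

def predict(X, w, b):
    """Return 1 or 2 for each row of X."""
    return np.where(X @ w + b > 0, 1, 2)

# Synthetic example: dimension p = 50, n = 20 observations per class,
# with the class means differing in the first 5 coordinates only.
rng = np.random.default_rng(0)
p, n = 50, 20
shift = np.zeros(p)
shift[:5] = 2.0
X1 = rng.normal(size=(n, p)) + shift
X2 = rng.normal(size=(n, p))

w, b = fit_plugin_lda(X1, X2)
train_pred = predict(np.vstack([X1, X2]), w, b)
```

The rule is linear in the observation, so any change of mean or precision estimator only changes `w` and `b`. The ridge term is a design convenience for this sketch, chosen because the pooled covariance is singular when the dimension exceeds the sample size.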