Linear Discriminant Analysis (LDA) is one of the oldest and most popular linear methods for supervised classification. In this paper, we demonstrate that the exact LDA projection vector can be computed from unlabelled data, provided some minimal prior information is available. More precisely, we show that, when only unlabelled data are available, any one of the following three pieces of information is sufficient to compute the LDA projection vector: (1) the class average of one of the two classes, (2) the difference between the two class averages (up to a scaling), or (3) the class covariance matrices (up to a scaling). These theoretical results are validated in numerical experiments, which demonstrate that this minimally informed Linear Discriminant Analysis (MILDA) model closely matches the performance of a supervised LDA model. Furthermore, we show that the MILDA projection vector can be computed in closed form at a computational cost comparable to that of LDA, and that it adapts quickly to non-stationary data, making it well suited for use as an adaptive classifier.
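To make case (2) concrete, the following is a minimal sketch and not necessarily the derivation used in the paper. It relies on a standard identity: for two classes, the total scatter of the unlabelled data equals the within-class scatter plus a rank-one term along the class-mean difference, so by the Sherman-Morrison formula the total-scatter inverse applied to that difference points in the same direction as the supervised LDA vector. The function names, the synthetic Gaussian data, and the scaling factor 2.5 are illustrative assumptions.

```python
import numpy as np


def lda_direction(X, y):
    """Supervised two-class LDA direction, w proportional to Sw^{-1} (mu1 - mu0)."""
    X0, X1 = X[y == 0], X[y == 1]
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter, pooled over both classes (needs labels).
    Sw = (X0 - mu0).T @ (X0 - mu0) + (X1 - mu1).T @ (X1 - mu1)
    return np.linalg.solve(Sw, mu1 - mu0)


def direction_from_mean_difference(X, d):
    """Label-free direction, w proportional to St^{-1} d.

    X holds unlabelled samples (one per row); d is the assumed prior
    knowledge: the class-mean difference, known only up to a positive scaling.
    """
    Xc = X - X.mean(axis=0)
    St = Xc.T @ Xc  # total scatter, computable without labels
    return np.linalg.solve(St, d)


def cosine(u, v):
    """Cosine similarity, used to compare directions regardless of scale."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


# Synthetic two-class Gaussian data with a shared covariance (illustrative only).
rng = np.random.default_rng(0)
A = rng.normal(size=(5, 5))
cov = A @ A.T + np.eye(5)
mu0, mu1 = np.zeros(5), rng.normal(size=5)
X = np.vstack([
    rng.multivariate_normal(mu0, cov, size=1000),
    rng.multivariate_normal(mu1, cov, size=1000),
])
y = np.repeat([0, 1], 1000)

w_sup = lda_direction(X, y)
# Labels are not used here; only the scaled population mean difference is.
w_unsup = direction_from_mean_difference(X, 2.5 * (mu1 - mu0))
print(f"cosine similarity: {cosine(w_sup, w_unsup):.4f}")
```

When `d` is an exact multiple of the sample mean difference, the two directions coincide up to numerical precision. With the population difference, as in the demo, they agree only up to sampling noise.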