Learning using statistical invariants (LUSI) is a new learning paradigm that adopts a weak convergence mechanism and can be applied to a wider range of classification problems. However, computing the invariant matrices in LUSI is expensive when training on large-scale datasets. To address this issue, this paper introduces a granularity statistical invariant for LUSI and develops a new learning paradigm called learning using granularity statistical invariants (LUGSI). LUGSI employs both strong and weak convergence mechanisms from the perspective of expected risk minimization. To the best of our knowledge, this is the first construction of granularity statistical invariants. Compared to LUSI, the new statistical invariant brings two advantages. First, it enhances the structural information of the data. Second, LUGSI transforms a large invariant matrix into a smaller one by maximizing the distance between classes, which makes large-scale classification problems feasible and significantly speeds up training. Experimental results indicate that LUGSI not only generalizes better but also trains faster, particularly on large-scale datasets.
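To illustrate why the invariant matrix becomes costly as the training set grows, the following is a minimal sketch based on the general LUSI formulation rather than on this paper's method. It assumes that a statistical invariant for a predicate ψ asks the predicate-weighted average of the model outputs to match the predicate-weighted average of the labels, and that the invariant matrix is the sum of outer products of the predicate vectors, so its size is ℓ × ℓ for ℓ training samples. The function names, the two predicates, and the toy data are hypothetical; the sketch does not show how LUGSI builds its granularity invariants.

```python
def invariant_gap(psi, scores, labels):
    """Empirical gap (1/l) * sum_i psi(x_i) * (f(x_i) - y_i) for one predicate.

    A gap close to zero means the model outputs satisfy the (assumed)
    statistical invariant defined by this predicate.
    """
    n = len(labels)
    return sum(p * (f - y) for p, f, y in zip(psi, scores, labels)) / n


def invariant_matrix(predicate_vectors):
    """Sum of outer products of the predicate vectors.

    Each predicate vector has one entry per training sample, so the
    result is an l x l matrix whose storage and build cost grow
    quadratically with the number of samples l.
    """
    n = len(predicate_vectors[0])
    return [
        [sum(vec[i] * vec[j] for vec in predicate_vectors) for j in range(n)]
        for i in range(n)
    ]


# Toy data (hypothetical): four samples with binary labels and model scores.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.6, 0.9]

# Two illustrative predicates: the constant 1 and the first feature value.
ones = [1.0, 1.0, 1.0, 1.0]
first_feature = [0.0, 1.0, 2.0, 3.0]

gap_ones = invariant_gap(ones, scores, labels)
gap_feature = invariant_gap(first_feature, scores, labels)
P = invariant_matrix([ones, first_feature])
```

In this sketch the matrix `P` has as many rows and columns as there are training samples, which is the scaling that the abstract identifies as the bottleneck and that a smaller invariant matrix is meant to avoid.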