The manifold regularization model is a semi-supervised learning model that exploits the geometric structure of a dataset, consisting of a small number of labeled samples and a large number of unlabeled samples, to construct a classifier. However, the original manifold norm restricts the model's performance to local regions. To address this limitation, this paper proposes an improvement to manifold regularization based on a label propagation model. We first improve the probability transition matrix of the diffusion map algorithm (a matrix that can be used to estimate the Neumann heat kernel) so that it accurately describes the label propagation process on the manifold. Using this matrix, we define a label propagation function on the dataset that describes the distribution of labels at different time steps. We then extend the label propagation function to the entire data manifold and prove that the extended function converges to a stable distribution after a sufficiently long time, so that it can be regarded as a classifier. Building on this idea, we propose a practical improvement to the manifold regularization model and validate its advantage through experiments.
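To make the label propagation idea concrete, the following is a minimal sketch and not the method proposed in the paper. It assumes a Gaussian kernel, the standard (unimproved) diffusion map normalization, and a simple scheme in which labeled samples are clamped after each step; the function names and the parameters `eps`, `alpha`, and `n_steps` are hypothetical and chosen only for illustration.

```python
import numpy as np


def diffusion_transition_matrix(X, eps=0.5, alpha=1.0):
    """Row-stochastic transition matrix from the standard diffusion map construction.

    Assumption: Gaussian kernel with bandwidth eps and density normalization alpha.
    """
    # Pairwise squared Euclidean distances between all samples.
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-sq_dists / eps)
    # Divide out the sampling density estimate q (diffusion map alpha-normalization).
    q = K.sum(axis=1)
    K_alpha = K / np.outer(q, q) ** alpha
    # Normalize each row so that it is a probability distribution.
    return K_alpha / K_alpha.sum(axis=1, keepdims=True)


def propagate_labels(P, y, n_classes, n_steps=200):
    """Propagate labels for n_steps steps; y holds class indices, -1 means unlabeled."""
    idx = np.flatnonzero(y >= 0)
    F0 = np.zeros((len(y), n_classes))
    F0[idx, y[idx]] = 1.0  # one-hot rows for the labeled samples
    F = F0.copy()
    for _ in range(n_steps):
        F = P @ F          # one diffusion step of the label distribution
        F[idx] = F0[idx]   # clamp labeled samples to their known labels
    return F


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two well-separated clusters with one labeled sample each.
    X = np.vstack([rng.normal(0.0, 0.3, size=(20, 2)),
                   rng.normal(5.0, 0.3, size=(20, 2))])
    y = -np.ones(40, dtype=int)
    y[0], y[20] = 0, 1
    F = propagate_labels(diffusion_transition_matrix(X), y, n_classes=2)
    print(F.argmax(axis=1))
```

In this sketch, the label distribution after many steps is read off as a classifier by taking the most probable class in each row, which mirrors the role the abstract assigns to the stable distribution of the label propagation function.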