Laplace learning is a popular machine learning algorithm for finding missing labels from a small number of labelled feature vectors using the geometry of a graph. More precisely, Laplace learning is based on minimising a graph-Dirichlet energy, equivalently a discrete Sobolev $\Wkp{1}{2}$ semi-norm, constrained to take the values of the known labels on a given subset. When the number of labels is finite and fixed, the variational problem is asymptotically ill-posed as the number of unlabelled feature vectors goes to infinity, due to a lack of regularity in minimisers of the continuum Dirichlet energy in any dimension higher than one. In particular, continuum minimisers are not continuous, so pointwise constraints are not well defined in the limit. One solution is to consider higher-order regularisation, which is the analogue of minimising Sobolev $\Wkp{s}{2}$ semi-norms. In this paper we consider the asymptotics of minimising a graph variant of the Sobolev $\Wkp{s}{2}$ semi-norm with pointwise constraints. We show that, as expected, one needs $s>d/2$ where $d$ is the dimension of the data manifold. We also show that there must be an upper bound on the connectivity of the graph; that is, highly connected graphs lead to degenerate behaviour of the minimiser even when $s>d/2$.
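As an illustration only, the constrained minimisation described in the abstract can be sketched in a few lines of NumPy. This is not the paper's construction. The function name `laplace_learning`, the $\varepsilon$-ball graph with 0/1 weights, the unnormalised graph Laplacian $L$ with no scaling in $n$ or $\varepsilon$, and the use of $u^\top L^s u$ as the discrete energy are all assumptions made for the example. The sketch also assumes the graph is connected, so that the linear system is solvable, and it uses a dense eigendecomposition, so it only suits small data sets.

```python
import numpy as np

def laplace_learning(X, labelled_idx, labels, eps, s=1):
    """Minimise u^T L^s u subject to u[labelled_idx] = labels.

    Hypothetical sketch: eps-ball graph, unnormalised Laplacian,
    no n- or eps-dependent scaling. Assumes a connected graph.
    """
    n = X.shape[0]
    # Pairwise squared Euclidean distances between feature vectors.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    # 0/1 weights: connect points closer than eps, no self-loops.
    W = (d2 <= eps ** 2).astype(float)
    np.fill_diagonal(W, 0.0)
    # Unnormalised graph Laplacian L = D - W.
    L = np.diag(W.sum(axis=1)) - W
    # L is symmetric positive semi-definite, so L^s is defined
    # through its eigendecomposition.
    evals, evecs = np.linalg.eigh(L)
    evals = np.clip(evals, 0.0, None)  # remove tiny negative round-off
    A = (evecs * evals ** s) @ evecs.T
    lab = np.asarray(labelled_idx)
    unl = np.setdiff1d(np.arange(n), lab)
    u = np.empty(n)
    u[lab] = labels
    # First-order condition on the unlabelled block:
    # A_uu u_u = -A_ul u_l.
    rhs = -A[np.ix_(unl, lab)] @ u[lab]
    u[unl] = np.linalg.solve(A[np.ix_(unl, unl)], rhs)
    return u

# Usage: 21 equally spaced points on [0, 1] with both endpoints labelled.
X = np.linspace(0.0, 1.0, 21).reshape(-1, 1)
u = laplace_learning(X, [0, 20], [0.0, 1.0], eps=0.06, s=1)
```

In this one-dimensional usage example the graph is a path and, for $s=1$, the minimiser is the linear interpolant between the two labels. The ill-posedness discussed in the abstract concerns dimensions higher than one, and this small example does not exhibit it.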