In this paper we prove discrete-to-continuum convergence rates for Poisson Learning, a graph-based semi-supervised learning algorithm that solves the graph Poisson equation with a source term consisting of a linear combination of Dirac deltas located at the labeled points and carrying the label information. The corresponding continuum equation is a Poisson equation with measure data in a Euclidean domain $\Omega \subset \mathbb{R}^d$. The singular nature of these equations is challenging and requires an approach with several distinct parts: (1) We prove quantitative error estimates for convolving the measure data of a Poisson equation with (approximately) radial functions supported on balls. (2) We use quantitative variational techniques to prove discrete-to-continuum convergence rates on random geometric graphs with bandwidth $\varepsilon>0$ for bounded source terms. (3) We show how to regularize the graph Poisson equation via mollification with the graph heat kernel, and we study fine asymptotics of the heat kernel on random geometric graphs. Combining these three pillars, we obtain $L^1$ convergence rates that scale, up to logarithmic factors, like $O(\varepsilon^{\frac{1}{d+2}})$ for general data distributions, and like $O(\varepsilon^{\frac{2-\sigma}{d+4}})$ for uniformly distributed data, where $\sigma>0$. These rates hold with high probability if $\varepsilon\gg\left({\log n}/{n}\right)^q$, where $n$ denotes the number of vertices of the graph and $q \approx \frac{1}{3d}$.
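To make the object of study concrete, the following is a minimal sketch of Poisson Learning on a random geometric graph, written under several assumptions that are not taken from the paper: the edge weights are the indicator of $|x_i - x_j| \le \varepsilon$, the Laplacian is the unnormalized one, all scaling constants are omitted, and the label values are centered so that the source term sums to zero. The function name `poisson_learning` and the parameters below are illustrative only.

```python
import numpy as np


def poisson_learning(X, labeled_idx, labels, eps):
    """Solve the graph Poisson equation L u = b on an eps-ball graph.

    Illustrative sketch only: indicator weights, unnormalized Laplacian,
    no scaling constants. b places the centered labels at labeled points.
    """
    n = X.shape[0]
    # Pairwise Euclidean distances between all sample points.
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Random geometric graph: connect points at distance at most eps.
    W = (dists <= eps).astype(float)
    np.fill_diagonal(W, 0.0)
    # Unnormalized graph Laplacian L = D - W.
    L = np.diag(W.sum(axis=1)) - W
    # Source term: centered labels as point masses at the labeled vertices.
    b = np.zeros(n)
    b[labeled_idx] = labels - labels.mean()
    # L is singular (constants lie in its kernel); the minimum-norm
    # least-squares solution is orthogonal to constants, hence mean-zero.
    u, *_ = np.linalg.lstsq(L, b, rcond=None)
    return u, b


# Hypothetical demo: 200 uniform points in the unit square, two labels.
rng = np.random.default_rng(0)
X = rng.random((200, 2))
labeled_idx = np.array([0, 1])
labels = np.array([1.0, -1.0])
u, b = poisson_learning(X, labeled_idx, labels, eps=0.25)
predicted = np.sign(u)  # binary prediction from the sign of u
```

The dense solve is chosen for brevity; on larger graphs a sparse Laplacian and an iterative solver would be the natural replacement.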