This paper studies online node classification in the transductive learning setting. Current methods either invert a graph kernel matrix, which takes $\mathcal{O}(n^3)$ time and $\mathcal{O}(n^2)$ space, or sample a large number of random spanning trees; both approaches are difficult to scale to large graphs. In this work, we propose an improvement based on the \textit{online relaxation} technique introduced in a series of works (Rakhlin et al., 2012; Rakhlin and Sridharan, 2015; 2017). We first prove an effective regret of $\mathcal{O}(\sqrt{n^{1+\gamma}})$ when suitably parameterized graph kernels are chosen, and then propose an approximate algorithm, FastONL, which enjoys $\mathcal{O}(k\sqrt{n^{1+\gamma}})$ regret based on this relaxation. The key component of FastONL is a \textit{generalized local push} method that effectively approximates columns of the inverse matrix and applies to a range of popular kernels. Furthermore, the per-prediction cost is $\mathcal{O}(\text{vol}({\mathcal{S}})\log(1/\epsilon))$, which depends only locally on the graph, and the memory cost is linear. Experiments show that our scalable method achieves a better tradeoff between local and global consistency.
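To give a concrete sense of what a local push computes, the following is a minimal sketch of the classic local push for personalized PageRank, not the generalized variant proposed in this paper. It approximates a single column of $\alpha(I - (1-\alpha)AD^{-1})^{-1}$ while touching only nodes near the source. The function name `local_push_ppr`, the adjacency-dictionary input format, and the default values of `alpha` and `eps` are assumptions made for illustration.

```python
from collections import defaultdict, deque


def local_push_ppr(adj, source, alpha=0.15, eps=1e-4):
    """Approximate the `source` column of alpha * (I - (1 - alpha) * A D^{-1})^{-1}.

    adj: dict mapping each node to a list of its neighbors
         (assumed undirected, with no isolated nodes).
    Returns (p, r): the sparse estimate and the remaining residual, as dicts.
    """
    p = defaultdict(float)  # current estimate of the column
    r = defaultdict(float)  # residual mass not yet pushed
    r[source] = 1.0
    queue = deque([source])
    in_queue = {source}

    while queue:
        u = queue.popleft()
        in_queue.discard(u)
        deg_u = len(adj[u])
        mass = r[u]
        # Skip nodes whose residual is already below the degree-scaled threshold.
        if mass < eps * deg_u:
            continue
        # Push: keep an alpha fraction at u, spread the rest evenly to neighbors.
        p[u] += alpha * mass
        r[u] = 0.0
        share = (1.0 - alpha) * mass / deg_u
        for v in adj[u]:
            r[v] += share
            if v not in in_queue and r[v] >= eps * len(adj[v]):
                queue.append(v)
                in_queue.add(v)

    return dict(p), dict(r)


# Small hypothetical graph: a triangle 0-1-2 with a pendant node 3 attached to 2.
example_adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
estimate, residual = local_push_ppr(example_adj, source=0, alpha=0.2, eps=1e-3)
```

When the loop ends, every node $u$ has residual below $\epsilon \cdot d(u)$, so each entry of the estimate is within $\epsilon \cdot d(v)$ of the exact column. Each push moves at least $\alpha\epsilon \cdot d(u)$ mass into the estimate, which is why the work depends on the volume of the touched nodes rather than on $n$.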