The trade-off between regret and computational cost is a fundamental problem in online kernel regression, and previous algorithms addressing this trade-off cannot maintain optimal regret bounds at sublinear computational complexity. In this paper, we propose two new algorithms, AOGD-ALD and NONS-ALD, which maintain nearly optimal regret bounds at sublinear computational complexity, and we give sufficient conditions under which they work. Both algorithms dynamically maintain a set of nearly orthogonal basis functions used to approximate the kernel mapping, and keep nearly optimal regret bounds by controlling the approximation error. The number of basis functions depends on the approximation error and on the decay rate of the eigenvalues of the kernel matrix. If the eigenvalues decay exponentially, then AOGD-ALD and NONS-ALD achieve regret of $O(\sqrt{L(f)})$ and $O(\mathrm{d}_{\mathrm{eff}}(\mu)\ln{T})$, respectively, at a computational complexity in $O(\ln^2{T})$. If the eigenvalues decay polynomially with degree $p\geq 1$, then our algorithms keep the same regret bounds at a computational complexity in $o(T)$ when $p>4$ and $p\geq 10$, respectively. Here $L(f)$ is the cumulative loss of $f$ and $\mathrm{d}_{\mathrm{eff}}(\mu)$ is the effective dimension of the problem. The two regret bounds are nearly optimal and are not comparable.
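The idea of maintaining a small set of nearly orthogonal basis functions can be illustrated with a short sketch. It assumes that "ALD" refers to the standard approximate linear dependence test, which the abstract does not spell out, and it is not the paper's AOGD-ALD or NONS-ALD procedure. The Gaussian kernel, the function names `gaussian_kernel` and `ald_dictionary`, and the `threshold` parameter are all hypothetical choices made for illustration.

```python
import numpy as np


def gaussian_kernel(x, y, sigma=1.0):
    # Assumed kernel for illustration; the abstract does not fix a kernel.
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.exp(-np.dot(diff, diff) / (2.0 * sigma ** 2)))


def ald_dictionary(points, threshold=1e-3, kernel=gaussian_kernel):
    """Return indices of the points kept as basis elements.

    A point is kept only if its feature map cannot be approximated by the
    span of the already kept points to within `threshold` (squared error).
    """
    kept = []
    gram_inv = None  # inverse Gram matrix of the kept points
    for i, x in enumerate(points):
        kxx = kernel(x, x)
        if not kept:
            kept.append(i)
            gram_inv = np.array([[1.0 / kxx]])
            continue
        k_vec = np.array([kernel(points[j], x) for j in kept])
        coeffs = gram_inv @ k_vec           # best projection coefficients
        residual = kxx - k_vec @ coeffs     # squared approximation error
        if residual > threshold:
            # Grow the inverse Gram matrix with the block-inverse formula.
            n = len(kept)
            new_inv = np.empty((n + 1, n + 1))
            new_inv[:n, :n] = gram_inv + np.outer(coeffs, coeffs) / residual
            new_inv[:n, n] = -coeffs / residual
            new_inv[n, :n] = -coeffs / residual
            new_inv[n, n] = 1.0 / residual
            gram_inv = new_inv
            kept.append(i)
    return kept
```

In this sketch the work per incoming point scales with the square of the current dictionary size rather than with the number of rounds, which is why a bound on the number of basis functions translates into a bound on computational cost. A larger `threshold` keeps fewer basis functions at the price of a larger approximation error.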