We give improved tradeoffs between space and regret for the online learning with expert advice problem over $T$ days with $n$ experts. Given a space budget of $n^{\delta}$ for $\delta \in (0,1)$, we provide an algorithm achieving regret $\tilde{O}(n^2 T^{1/(1+\delta)})$, improving upon the regret bound $\tilde{O}(n^2 T^{2/(2+\delta)})$ in the recent work of [PZ23]. The improvement is particularly salient in the regime $\delta \rightarrow 1$ where the regret of our algorithm approaches $\tilde{O}_n(\sqrt{T})$, matching the $T$ dependence in the standard online setting without space restrictions.
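The improvement over [PZ23] can be checked by comparing the two exponents of $T$ directly. The calculation below is only a sanity check on the bounds as stated above, not part of the paper's analysis; it ignores the shared $n^2$ factor and the polylogarithmic factors hidden by $\tilde{O}$.

$$\frac{2}{2+\delta} - \frac{1}{1+\delta} = \frac{2(1+\delta) - (2+\delta)}{(1+\delta)(2+\delta)} = \frac{\delta}{(1+\delta)(2+\delta)} > 0 \quad \text{for all } \delta \in (0,1).$$

The new exponent is therefore strictly smaller throughout the range. As $\delta \rightarrow 1$ the exponents tend to $1/2$ and $2/3$ respectively, a gap of $1/6$; as $\delta \rightarrow 0$ both tend to $1$ and the gap vanishes.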