We consider Kernelized Bandits (KBs) for optimizing a function $f : \mathcal{X} \rightarrow [0,1]$ belonging to the Reproducing Kernel Hilbert Space (RKHS) $\mathcal{H}_k$. Mainstream work on kernelized bandits focuses on a subgaussian noise model, in which observations take the form $f(\mathbf{x}_t)+\epsilon_t$, where $\epsilon_t$ is subgaussian noise (Chowdhury and Gopalan, 2017). In contrast, we focus on the case in which we observe realizations $y_t \sim \text{Ber}(f(\mathbf{x}_t))$ sampled from a Bernoulli distribution with parameter $f(\mathbf{x}_t)$. While the Bernoulli model has been investigated successfully in multi-armed bandits (Garivier and Capp\'e, 2011), logistic bandits (Faury et al., 2022), and bandits in metric spaces (Magureanu et al., 2014), it remains an open question whether tight results can be obtained for KBs. This paper aims to draw the attention of the online learning community to this open problem.
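To make the two feedback models concrete, the following minimal sketch contrasts additive subgaussian observations with Bernoulli observations of the same function. Everything specific here is an assumption made for illustration only: the one-dimensional domain, the Gaussian (RBF) kernel, the lengthscale, the centers and weights defining $f$, and the noise level `sigma`. None of these come from the paper, and the sketch simulates the feedback only; it is not a bandit algorithm.

```python
import math
import random

# Hypothetical Gaussian (RBF) kernel on the real line; the lengthscale is an
# arbitrary illustrative choice.
def rbf_kernel(x, z, lengthscale=0.2):
    return math.exp(-((x - z) ** 2) / (2.0 * lengthscale ** 2))

# Hypothetical f in the RKHS, written as a finite kernel expansion.
# The weights are nonnegative and sum to 0.9, and the kernel is at most 1,
# so f(x) lies in [0, 0.9], a subset of [0, 1].
CENTERS = [0.2, 0.5, 0.8]
WEIGHTS = [0.3, 0.4, 0.2]

def f(x):
    return sum(w * rbf_kernel(x, c) for w, c in zip(WEIGHTS, CENTERS))

def subgaussian_feedback(x, rng, sigma=0.1):
    """Additive-noise model: f(x) + eps, with Gaussian (hence subgaussian) eps."""
    return f(x) + rng.gauss(0.0, sigma)

def bernoulli_feedback(x, rng):
    """Bernoulli model: y in {0, 1} with P(y = 1) = f(x)."""
    return 1 if rng.random() < f(x) else 0

def empirical_mean(x, n, rng):
    """Average of n Bernoulli observations at x; concentrates around f(x)."""
    return sum(bernoulli_feedback(x, rng) for _ in range(n)) / n

if __name__ == "__main__":
    rng = random.Random(0)
    x = 0.5
    print("f(x)                 :", f(x))
    print("subgaussian sample   :", subgaussian_feedback(x, rng))
    print("Bernoulli sample     :", bernoulli_feedback(x, rng))
    print("Bernoulli mean, n=1e4:", empirical_mean(x, 10_000, rng))
```

In the Bernoulli model the learner only ever sees a binary outcome, and the variance of each observation, $f(\mathbf{x}_t)(1 - f(\mathbf{x}_t))$, depends on the query point rather than being a fixed noise level.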