We consider the problem of optimizing a function network in the noise-free grey-box setting with RKHS function classes, where the exact intermediate results are observable. We assume that the structure of the network is known (but not the underlying functions comprising it), and we study three types of structures: (1) chain: a cascade of scalar-valued functions, (2) multi-output chain: a cascade of vector-valued functions, and (3) feed-forward network: a fully connected feed-forward network of scalar-valued functions. We propose a sequential upper-confidence-bound-based algorithm, GPN-UCB, along with a general theoretical upper bound on its cumulative regret. In addition, we propose a non-adaptive sampling-based method along with a theoretical upper bound on its simple regret for the Matérn kernel. We also provide algorithm-independent lower bounds on the simple regret and cumulative regret. Our regret bounds for GPN-UCB have the same dependence on the time horizon as the best known bounds in the vanilla black-box setting, as well as near-optimal dependencies on other parameters (e.g., RKHS norm and network length).
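To make the grey-box chain setting concrete, the following is a minimal sketch, not the paper's GPN-UCB algorithm. It assumes a three-layer chain of scalar functions and shows that a single query returns every intermediate output, not only the final value. The layer functions `f1`, `f2`, `f3`, the helper names `query_chain` and `regrets`, and the convention of measuring simple regret against the best queried point are all illustrative assumptions.

```python
import math

# Hypothetical layer functions, chosen only for illustration.
# In the setting described above, the layers are unknown RKHS functions.
def f1(x):
    return math.sin(3.0 * x)

def f2(z):
    return z * z

def f3(z):
    return 1.0 - 0.5 * z

LAYERS = [f1, f2, f3]

def query_chain(x, layers=LAYERS):
    """Evaluate the chain at x and return all intermediate outputs.

    The last entry is the network output g(x); the earlier entries are
    the exact (noise-free) intermediate results seen by the learner.
    """
    outputs = []
    z = x
    for f in layers:
        z = f(z)
        outputs.append(z)
    return outputs

def regrets(queries, optimum_value, layers=LAYERS):
    """Return (cumulative regret, simple regret) for a list of queries.

    Assumption: simple regret is measured at the best queried point.
    """
    values = [query_chain(x, layers)[-1] for x in queries]
    cumulative = sum(optimum_value - v for v in values)
    simple = optimum_value - max(values)
    return cumulative, simple
```

For this toy chain, g(x) = 1 - 0.5 sin²(3x), so the maximum value is 1, attained at x = 0. Calling `regrets([0.5, 0.0], 1.0)` therefore gives zero simple regret, while the cumulative regret still counts the suboptimal first query.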