In this paper, we present the first sublinear $\alpha$-regret bounds for online $k$-submodular optimization problems with full-bandit feedback, where $\alpha$ is the corresponding offline approximation ratio. Specifically, we propose online algorithms for several $k$-submodular stochastic combinatorial multi-armed bandit problems: (i) monotone functions under individual size constraints, (ii) monotone functions under matroid constraints, (iii) non-monotone functions under matroid constraints, (iv) non-monotone functions without constraints, and (v) monotone functions without constraints. We transform approximation algorithms for offline $k$-submodular maximization into online algorithms via the offline-to-online framework proposed by Nie et al. (2023a). A key contribution of our work is analyzing the robustness of the offline algorithms.