We study the problem of federated contextual combinatorial cascading bandits, where $|\mathcal{U}|$ agents collaborate under the coordination of a central server to provide tailored recommendations to the $|\mathcal{U}|$ corresponding users. Existing works either adopt a synchronous framework, which requires full agent participation and global synchronization, or assume homogeneous users with identical behaviors. We overcome these limitations by considering (1) federated agents operating in an asynchronous communication paradigm, where no mandatory synchronization is required and each agent communicates independently with the server, and (2) heterogeneous user behaviors, where users can be stratified into $J \le |\mathcal{U}|$ latent user clusters, each exhibiting distinct preferences. For this setting, we propose a UCB-type algorithm with carefully designed communication protocols. Through theoretical analysis, we give sub-linear regret bounds on par with those achieved in the synchronous framework, while incurring only logarithmic communication costs. Empirical evaluation on synthetic and real-world datasets validates our algorithm's superior performance in terms of regret and communication cost.
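To make two ingredients of the setting concrete, the following is a minimal sketch, not the paper's algorithm. It assumes a disjunctive cascade model (the user scans the ranked list from the top and stops at the first click), and it uses a simple geometric upload rule as a scalar stand-in for a communication protocol with logarithmic cost. The names `cascade_feedback`, `count_uploads`, and the parameter `growth` are hypothetical and chosen only for illustration.

```python
def cascade_feedback(ranked_items, click_position):
    """Turn one round of cascade feedback into (item, reward) observations.

    Assumption: the user examines items top-down and stops at the first
    click, so items after the click are unobserved. `click_position` is the
    0-based index of the click, or None if nothing was clicked.
    """
    if click_position is None:
        # No click: every item in the list was examined and skipped.
        return [(item, 0) for item in ranked_items]
    examined = ranked_items[: click_position + 1]
    return [(item, int(i == click_position)) for i, item in enumerate(examined)]


def count_uploads(num_rounds, growth=2.0):
    """Count uploads of a single agent under a hypothetical geometric trigger.

    The agent uploads only when its total number of observations has grown
    by a factor of `growth` since its last upload, independently of all
    other agents. This is an illustrative rule, not the paper's protocol.
    """
    uploads = 0
    synced = 0  # observations the server has already received from this agent
    for total in range(1, num_rounds + 1):  # one new observation per round
        if total >= growth * max(synced, 1):
            synced = total
            uploads += 1
    return uploads


if __name__ == "__main__":
    print(cascade_feedback(["a", "b", "c", "d"], 2))
    print(count_uploads(1000))
```

Under this rule an agent uploads at rounds 2, 4, 8, and so on, so the number of uploads grows with the logarithm of the horizon rather than linearly, which is the kind of behavior the logarithmic communication cost in the abstract refers to.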