In HTTP-based video streaming, bitrate adaptation selects the quality of each video chunk according to current network conditions. Some previous works have applied deep reinforcement learning (DRL) algorithms to choose each chunk's bitrate from the observed states so as to maximize the quality of experience (QoE). However, to build a model that performs well across various environments, such as 3G, 4G, and Wi-Fi, the states observed in these environments must be sent to a server for centralized training. In this work, we integrate federated learning (FL) into DRL-based rate adaptation to train a model suited to different environments. The clients in the proposed framework train their models locally and send only their weights to the server. Simulations show that our federated DRL-based rate adaptation framework, called FDRLABR, combined with different DRL algorithms such as deep Q-learning, advantage actor-critic, and proximal policy optimization, outperforms traditional bitrate adaptation methods in various environments.
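To make the weight-only exchange concrete, the following is a minimal sketch of one server-side aggregation round. It assumes a FedAvg-style weighted average, which the abstract does not specify; the function `federated_average`, the layer name `policy`, and the sample counts are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Assumed representation: layer name -> flattened list of parameters.
Weights = Dict[str, List[float]]


def federated_average(client_weights: List[Weights], client_sizes: List[int]) -> Weights:
    """Average client weights, weighting each client by its local sample count.

    This is a FedAvg-style rule used here as an assumption; the aggregation
    rule in the described framework may differ.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need at least one client and one size per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    averaged: Weights = {}
    for name in client_weights[0]:
        length = len(client_weights[0][name])
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(length)
        ]
    return averaged


# Hypothetical round: three clients trained locally on 3G, 4G, and Wi-Fi traces.
# Only these weights reach the server; the observed states stay on the clients.
clients = [
    {"policy": [0.0, 1.0]},
    {"policy": [1.0, 1.0]},
    {"policy": [2.0, 4.0]},
]
sizes = [100, 100, 200]

global_weights = federated_average(clients, sizes)
print(global_weights)  # {'policy': [1.25, 2.5]}
```

In a full training loop, the server would send `global_weights` back to the clients, each of which would resume local DRL training from it before the next round.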