Fog computing has emerged as a promising paradigm to address the challenges of processing and managing data generated by the Internet of Things (IoT). Load balancing (LB) plays a crucial role in Fog computing environments in optimizing overall system performance. It requires efficient resource allocation to improve resource utilization, minimize latency, and enhance the quality of service for end-users. In this work, we improve the performance of privacy-aware Reinforcement Learning (RL) agents that optimize the execution delay of IoT applications by minimizing the waiting delay. To maintain privacy, these agents optimize the waiting delay by minimizing the change in the number of queued requests in the whole system, i.e., without explicitly observing either the actual number of requests queued in each Fog node or the compute resource capabilities of those nodes. Besides improving the performance of these agents, we propose a lifelong learning framework in which lightweight inference models are used during deployment to minimize action delay and are retrained only when the environment changes significantly. To improve the performance, minimize the training cost, and adapt the agents to those changes, we explore the application of Transfer Learning (TL). TL transfers the knowledge acquired in a source domain to a target domain, enabling the reuse of learned policies and experiences. TL can also be used to pre-train the agent in simulation before fine-tuning it in the real environment; this significantly reduces the probability of failure compared to learning from scratch in the real environment. To our knowledge, no existing work in the literature uses TL to address lifelong learning for RL-based Fog LB; this gap is one of the main obstacles to deploying RL LB solutions in Fog systems.
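As a rough illustration of two ideas above, the following Python sketch shows (a) a reward computed only from the change in the system-wide number of queued requests, so that no per-node queue lengths or node capabilities are observed, and (b) TL as a warm start of a target agent from a source agent's learned values. This is a sketch under stated assumptions, not the method of this work: the tabular Q-learning update is a stand-in chosen for brevity, and all names (`queue_change_reward`, `warm_start`, `q_update`) and the hyperparameters `alpha` and `gamma` are hypothetical.

```python
from typing import Dict, Hashable, Tuple

# Hypothetical tabular value store: (state, action) -> estimated value.
QTable = Dict[Tuple[Hashable, int], float]


def queue_change_reward(prev_total_queued: int, curr_total_queued: int) -> float:
    """Reward is the negative change in the system-wide queued-request count.

    Only the aggregate count is used; per-node queues are never observed.
    """
    return -float(curr_total_queued - prev_total_queued)


def warm_start(source_q: QTable) -> QTable:
    """TL sketch: initialise the target agent from a copy of the source values."""
    return dict(source_q)


def q_update(
    q: QTable,
    state: Hashable,
    action: int,
    reward: float,
    next_state: Hashable,
    n_actions: int,
    alpha: float = 0.1,  # assumed learning rate
    gamma: float = 0.9,  # assumed discount factor
) -> None:
    """One standard Q-learning step; unseen (state, action) pairs default to 0."""
    best_next = max(q.get((next_state, a), 0.0) for a in range(n_actions))
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)


# Usage: pre-train in a source setting (e.g. simulation), then fine-tune a copy.
source_q: QTable = {}
q_update(source_q, "s0", 0, queue_change_reward(10, 7), "s1", n_actions=2)
target_q = warm_start(source_q)
q_update(target_q, "s0", 0, queue_change_reward(7, 9), "s1", n_actions=2)
```

Because `warm_start` copies the table, fine-tuning the target agent leaves the source agent's values untouched, which is one simple way to keep a reusable source policy around for later retraining.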