Cloud computing is an attractive technology for providing computing resources over the Internet. Task scheduling is a critical issue in cloud computing, and an efficient task scheduling method can improve overall cloud performance. Because cloud computing is a large-scale, geographically distributed environment, traditional scheduling methods that allocate resources in a centralized manner are ineffective. Traditional methods also struggle to make rational decisions in a timely manner when the external environment changes. This paper proposes a decentralized BDI (belief-desire-intention) agent-based scheduling framework for cloud computing. BDI agents are well suited to modelling dynamic environments because they can update their beliefs, change their desires, and trigger behaviours in response to environmental changes. In addition, to avoid communication blocking caused by environmental uncertainties, an asynchronous communication mode with a notify listener is employed. The proposed framework covers both the task scheduling and rescheduling stages and takes into account uncertain events that can interrupt task execution. Two agent-based algorithms are proposed to implement the task scheduling and rescheduling processes, and a novel recommendation mechanism is presented in the scheduling stage to reduce the impact of information synchronization delays. The proposed framework is implemented with JADEX and tested on CloudSim. The experimental results show that our framework can minimize the task makespan, balance resource utilization in a large-scale environment, and maximize the task success rate when uncertain events occur.
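To make the belief-desire-intention idea concrete, the following is a minimal sketch of how a single scheduling agent might update its beliefs, commit tasks to resources, and reschedule after an uncertain event. It is not the paper's algorithm: the class `SchedulerAgent`, its methods, the VM names, and the "least believed backlog" rule are all hypothetical choices made for illustration, and the asynchronous communication and recommendation mechanism are not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class SchedulerAgent:
    """Toy BDI-style agent; every name here is a hypothetical illustration."""
    # Beliefs: the backlog (in seconds) the agent currently believes each VM has.
    beliefs: dict = field(default_factory=dict)
    # Intentions: task -> (vm, length) assignments the agent has committed to.
    intentions: dict = field(default_factory=dict)

    def perceive(self, vm, backlog):
        """Update a belief from a load report; backlog=None means the VM failed."""
        if backlog is None:
            self.beliefs.pop(vm, None)
        else:
            self.beliefs[vm] = backlog

    def schedule(self, task, length):
        """Desire: a short makespan. Commit the task to the least-loaded VM."""
        if not self.beliefs:
            raise RuntimeError("the agent believes no VM is available")
        vm = min(self.beliefs, key=self.beliefs.get)
        self.beliefs[vm] += length
        self.intentions[task] = (vm, length)
        return vm

    def on_vm_failure(self, failed_vm):
        """Uncertain event: forget the VM and reschedule the tasks committed to it."""
        self.perceive(failed_vm, None)
        interrupted = [(task, length)
                       for task, (vm, length) in self.intentions.items()
                       if vm == failed_vm]
        return {task: self.schedule(task, length) for task, length in interrupted}


agent = SchedulerAgent()
agent.perceive("vm-a", 0.0)
agent.perceive("vm-b", 5.0)
first = agent.schedule("t1", 4.0)    # vm-a has the smaller believed backlog
second = agent.schedule("t2", 3.0)   # vm-a is at 4.0, still below vm-b's 5.0
moved = agent.on_vm_failure("vm-a")  # both tasks are re-committed to vm-b
```

In this sketch the rescheduling step simply reuses the scheduling rule on the surviving resources; a decentralized deployment would run one such agent per site and exchange load reports as messages rather than direct method calls.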