Efficient task scheduling in large-scale distributed systems presents significant challenges due to dynamic workloads, heterogeneous resources, and competing quality-of-service requirements. Traditional centralized approaches face scalability limitations and single points of failure, while classical heuristics lack adaptability to changing conditions. This paper proposes a decentralized multi-agent deep reinforcement learning (DRL-MADRL) framework for task scheduling in heterogeneous distributed systems. We formulate the problem as a Decentralized Partially Observable Markov Decision Process (Dec-POMDP) and develop a lightweight actor-critic architecture implemented using only NumPy, enabling deployment on resource-constrained edge devices without heavyweight machine learning frameworks. Using workload characteristics derived from the publicly available Google Cluster Trace dataset, we evaluate our approach on a 100-node heterogeneous system processing 1,000 tasks per episode over 30 experimental runs. Experimental results demonstrate 15.6% improvement in average task completion time (30.8s vs 36.5s for random baseline), 15.2% energy efficiency gain (745.2 kWh vs 878.3 kWh), and 82.3% SLA satisfaction compared to 75.5% for baselines, with all improvements statistically significant (p < 0.001). The lightweight implementation requires only NumPy, Matplotlib, and SciPy. Complete source code and experimental data are provided for full reproducibility at https://github.com/danielbenniah/marl-distributed-scheduling.
翻译:大规模分布式系统中的高效任务调度面临动态工作负载、异构资源以及相互竞争的服务质量需求等重大挑战。传统集中式方法存在可扩展性瓶颈和单点故障问题,而经典启发式算法难以适应动态变化场景。本文提出一种用于异构分布式系统任务调度的去中心化多智能体深度强化学习框架。我们将该问题建模为去中心化部分可观测马尔可夫决策过程,并开发一种仅基于NumPy实现的轻量级行动者-评论家架构,使其无需依赖重型机器学习框架即可部署于资源受限的边缘设备。采用公开Google集群追踪数据集的工作负载特征,我们在30次实验中评估了该框架在包含100个异构节点的系统上单轮处理1000个任务的表现。实验结果表明,与随机基线相比,平均任务完成时间提升15.6%(30.8秒对比36.5秒),能效增益达15.2%(745.2千瓦时对比878.3千瓦时),SLA满足率达82.3%(基线为75.5%),所有改进均具有统计显著性(p < 0.001)。该轻量级实现仅需依赖NumPy、Matplotlib和SciPy库。完整源代码与实验数据已发布于https://github.com/danielbenniah/marl-distributed-scheduling以确保完全可复现。