Emerging smart grid applications analyze large volumes of data collected from millions of meters and systems to support distributed monitoring and real-time control. However, current parallel data processing systems are designed for general-purpose applications and do not account for the massive volume of collected data, which causes long data transfer delays during computation and slow response times in smart grid systems. A promising way to reduce this delay is to jointly schedule computation tasks and data transfers. We observe that smart grid data analytics jobs require the intermediate data exchanged between computation stages to be transmitted in order, so as to avoid network congestion. This requirement makes existing scheduling algorithms inefficient. In this work, we propose an integrated computing and communication task scheduling scheme. We give a mathematical formulation of the scheduling problem for smart grid data analytics jobs; because of its strongly coupled constraints, the formulation cannot be solved directly by existing optimization methods. We combine several techniques to linearize it so that the Branch and Cut method can be applied. Using the topological information in the job graph, we further propose the Topology Aware Branch and Cut method to speed up the search for optimal solutions. Numerical results demonstrate the effectiveness of the proposed method.