H-EYE: Holistic Resource Modeling and Management for Diversely Scaled Edge-Cloud Systems

Computing systems have been evolving to be more pervasive, heterogeneous, and dynamic. An increasing number of emerging domains now rely on a diverse edge-to-cloud continuum, where the execution of applications often spans several tiers of systems with significantly heterogeneous computational capabilities. Resources in each tier are often handled in isolation due to scalability and privacy concerns. However, better overall resource utilization could be achieved if the different tiers had the means to communicate their computational capabilities. In this paper, we propose H-EYE, a universal approach to holistically capture the diverse computational characteristics of edge-cloud systems with arbitrary topologies and to manage the assignment of tasks to computational resources with the whole continuum in scope. Our proposed work introduces two significant innovations: (1) We present a multi-layer, graph-based hardware (HW) representation and a modular performance modeling interface that can capture interactions and interference between different computing and communication resources in the system at the desired level of detail. (2) We introduce a novel orchestrator mechanism that leverages the graph-based HW representation to hierarchically locate target devices to which a given set of tasks can be mapped. The orchestrator provides isolation for various device groups and allows hierarchical abstraction to scalably find mappings that satisfy system deadlines. The orchestrator internally relies on a novel traverser that takes shared-resource slowdown into account. We demonstrate the utility and flexibility of H-EYE on edge-server systems that are deployed in the field in two different disciplines, improving latency by up to 47% over baselines with less than 2% scheduling overhead.
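
As a rough illustration of the two ideas above, the following Python sketch models device groups as a small hierarchy and searches it for a device whose predicted latency meets a deadline. It is not H-EYE's implementation or API: the names `Device`, `Group`, `predicted_latency`, and `find_device`, the linear slowdown model, and the `slowdown_per_task` value are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Device:
    name: str
    base_latency_ms: float             # assumed stand-alone latency of one task
    shared_with: Optional[str] = None  # assumed shared resource, e.g. a memory bus


@dataclass
class Group:
    name: str
    devices: List[Device] = field(default_factory=list)
    children: List["Group"] = field(default_factory=list)


def predicted_latency(device: Device, load: Dict[str, int],
                      slowdown_per_task: float = 0.25) -> float:
    """Scale the base latency by an assumed linear shared-resource slowdown."""
    co_runners = load.get(device.shared_with, 0) if device.shared_with else 0
    return device.base_latency_ms * (1.0 + slowdown_per_task * co_runners)


def find_device(group: Group, deadline_ms: float,
                load: Dict[str, int]) -> Optional[Tuple[Device, float]]:
    """Try the group's own devices first, then descend into child groups."""
    best: Optional[Tuple[Device, float]] = None
    for dev in group.devices:
        latency = predicted_latency(dev, load)
        if latency <= deadline_ms and (best is None or latency < best[1]):
            best = (dev, latency)
    if best is not None:
        return best
    for child in group.children:
        found = find_device(child, deadline_ms, load)
        if found is not None:
            return found
    return None
```

In this sketch a group only exposes its child groups through `find_device`, which loosely mirrors the isolation and hierarchical abstraction described in the abstract; a real performance model would replace the linear slowdown with measured or learned behavior.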
