We propose sVIRGO, a scalable virtual tree hierarchical framework for large-scale distributed systems. sVIRGO constructs virtual hierarchical trees directly on physical nodes, so each node can assume multiple hierarchical roles without an overlay network. The hierarchy preserves locality and is organized into configurable layers within regions. Coordination across thousands of regions is achieved through virtual upper-layer roles that are dynamically mapped onto nodes, up to the top layer. Each region maintains multiple active coordinators that monitor local health and dynamically re-select coordinators when failures occur. A temporary drop in the number of active coordinators below the minimum threshold does not compromise coordination. This design provides near-zero recovery latency, bounded communication overhead, and exponentially reduced failure probability, while maintaining safety, liveness, and robustness under mobile, interference-prone, or adversarial conditions. Communication is decoupled from the hierarchy and may use multi-frequency wireless links. Two message hop strategies are supported: (i) when long-distance, infrastructure-assisted channels are available, coordinators exploit the virtual tree to minimize hops; (ii) without such channels, messages propagate via adjacent regions. sVIRGO also supports Layer-Scoped Command Execution: commands and coordination actions are executed within the scope of each hierarchical layer, enabling efficient local and regional decision-making while limiting unnecessary global propagation.
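The claim that keeping multiple active coordinators per region reduces failure probability exponentially can be illustrated with a minimal sketch. It rests on a simplifying assumption that the abstract does not state: every coordinator fails independently with the same probability `p_fail`. Under that assumption, a region with `k` active coordinators loses all of them at once with probability `p_fail ** k`. The function names and the example parameters below are hypothetical and chosen only for illustration; they are not part of sVIRGO.

```python
def region_outage_probability(p_fail: float, k: int) -> float:
    """Probability that all k coordinators of one region are down at once.

    Assumption (not from the paper): failures are independent and each
    coordinator fails with the same probability p_fail.
    """
    if not 0.0 <= p_fail <= 1.0:
        raise ValueError("p_fail must be in [0, 1]")
    if k < 1:
        raise ValueError("k must be at least 1")
    return p_fail ** k


def any_region_outage_probability(p_fail: float, k: int, regions: int) -> float:
    """Probability that at least one of `regions` regions loses every coordinator.

    Regions are also assumed to fail independently of each other.
    """
    if regions < 1:
        raise ValueError("regions must be at least 1")
    q = region_outage_probability(p_fail, k)
    return 1.0 - (1.0 - q) ** regions


if __name__ == "__main__":
    # Hypothetical parameters: 10% per-coordinator failure rate, 1000 regions.
    for k in (1, 2, 3, 4):
        per_region = region_outage_probability(0.1, k)
        system_wide = any_region_outage_probability(0.1, k, 1000)
        print(f"k={k}: per-region={per_region:.6f}, any-region={system_wide:.6f}")
```

Each additional coordinator multiplies the per-region outage probability by `p_fail`, which is the sense in which the reduction is exponential in `k`. Correlated failures, such as wide-area interference affecting all coordinators of a region together, would weaken this bound, so the sketch should be read as a best case.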