End-to-end routing in Low Earth Orbit (LEO) satellite constellations (LSatCs) is a complex and dynamic problem. The topology is finite in size, dynamic, and predictable; the traffic to and from Earth and transiting the space segment is highly imbalanced; and the delay is dominated by the propagation time on non-congested routes and by the queueing time at Inter-Satellite Links (ISLs) on congested routes. Traditional routing algorithms depend on excessive communication with the ground or with other satellites, and they oversimplify the characterization of the path links towards the destination. We model the problem as a multi-agent Partially Observable Markov Decision Problem (POMDP) in which the nodes (i.e., the satellites) interact only with nearby nodes. We propose a distributed Q-learning solution that leverages the knowledge of the neighbours and the correlation between the routing decisions of the nodes. We compare our results to two centralized algorithms based on the shortest path: one that aims to use the links with the highest data rate, and a genie algorithm that knows the instantaneous queueing delays at all satellites. The results of our proposal are positive on every front: (1) it experiences delays that are comparable to the benchmarks in steady-state conditions; (2) it increases the traffic load that can be supported without congestion; and (3) it can be easily implemented in an LSatC, as it does not depend on the ground segment and minimizes the signaling overhead among satellites.
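To illustrate what a distributed Q-learning router can look like, the following is a minimal sketch in the style of classic Q-routing, not the algorithm evaluated in this work. Every name in it (`Node`, `build_ring`, `route_packet`, `train`), the four-satellite ring, the link delays, and the learning parameters are assumptions made for illustration. Each node keeps its own table of estimated delays to a destination via each neighbour and updates it using only the delay of the chosen link and the best estimate reported by that neighbour, so no ground segment or global view is required.

```python
import random
from collections import defaultdict


class Node:
    """One satellite with a local Q-table (hypothetical structure)."""

    def __init__(self, name, neighbours, alpha=0.5, epsilon=0.1):
        self.name = name
        self.neighbours = list(neighbours)
        self.alpha = alpha      # learning rate (assumed value)
        self.epsilon = epsilon  # exploration probability (assumed value)
        # q[dest][neighbour] = estimated delay to dest when forwarding via neighbour
        self.q = defaultdict(lambda: {n: 0.0 for n in self.neighbours})

    def best_estimate(self, dest):
        """Value shared with neighbours: lowest estimated delay to dest."""
        if dest == self.name:
            return 0.0
        return min(self.q[dest].values())

    def greedy(self, dest):
        """Neighbour with the lowest estimated delay to dest."""
        return min(self.q[dest], key=self.q[dest].get)

    def choose(self, dest, rng):
        """Epsilon-greedy next-hop selection."""
        if rng.random() < self.epsilon:
            return rng.choice(self.neighbours)
        return self.greedy(dest)

    def update(self, dest, neighbour, link_delay, neighbour_estimate):
        """Move the estimate towards link delay plus the neighbour's estimate."""
        target = link_delay + neighbour_estimate
        old = self.q[dest][neighbour]
        self.q[dest][neighbour] = old + self.alpha * (target - old)


def build_ring():
    """Toy 4-node ring A-B-C-D-A with made-up per-link delays."""
    links = {("A", "B"): 1.0, ("B", "C"): 1.0, ("A", "D"): 5.0, ("D", "C"): 5.0}
    delays, adjacency = {}, defaultdict(list)
    for (u, v), d in links.items():
        delays[(u, v)] = delays[(v, u)] = d
        adjacency[u].append(v)
        adjacency[v].append(u)
    nodes = {name: Node(name, nbrs) for name, nbrs in adjacency.items()}
    return nodes, delays


def route_packet(nodes, delays, src, dest, rng, max_hops=20):
    """Forward one packet hop by hop, learning at every hop."""
    current = src
    for _ in range(max_hops):
        if current == dest:
            return True
        node = nodes[current]
        nxt = node.choose(dest, rng)
        node.update(dest, nxt, delays[(current, nxt)], nodes[nxt].best_estimate(dest))
        current = nxt
    return current == dest


def train(episodes=500, seed=0):
    """Send packets from A to C and return the trained nodes."""
    rng = random.Random(seed)
    nodes, delays = build_ring()
    for _ in range(episodes):
        route_packet(nodes, delays, "A", "C", rng)
    return nodes


if __name__ == "__main__":
    trained = train()
    print("A forwards packets for C via", trained["A"].greedy("C"))
    print("A's estimates for C:", trained["A"].q["C"])
```

In this toy ring the low-delay path from A to C runs through B, and the learned greedy choice at A reflects that. In a real constellation the per-link delay would combine propagation and queueing time, and the topology would change over time; neither effect is modelled here.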