Space Network of Experts: Architecture and Expert Placement

Because they can harvest solar energy continuously and at high efficiency, space data centers are envisioned as a promising platform for executing energy-intensive large language models (LLMs). Recognizing this advantage, space and AI conglomerates (e.g., SpaceX, Google) are actively investing in this vision. A key challenge, however, is deploying a large-scale LLM efficiently across a satellite network, given the limited onboard computing and communication resources. This gives rise to a placement problem: model components must be partitioned and mapped to satellites so that the fundamentally different model architecture and network topology are reconciled and token generation remains low-latency. To address this problem, we present the Space Network of Experts (Space-XNet) framework, which targets the distributed execution of a popular mixture-of-experts (MoE) model in space. The proposed placement strategy has two levels: (1) layer placement, which assigns MoE layers to satellite subnets; and (2) intra-layer expert placement, which assigns individual experts to satellites within the subnet hosting their layer. For layer placement, we exploit the ring-like communication pattern of autoregressive inference to partition the satellite constellation along the orbiting direction into subnets arranged on a ring, each hosting one MoE layer. Based on this architecture, we formulate and solve an optimization problem for intra-layer expert placement that maps experts with heterogeneous activation probabilities onto satellites. The derived strategy reveals an intuitive principle: a frequently activated expert should be mapped to a satellite on a routing path with low expected latency. Experiments on a thousand-satellite constellation show that Space-XNet achieves at least a threefold latency reduction compared with conventional random and ablation-based placement strategies.
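The placement principle stated above can be illustrated with a minimal sketch. It is not the optimization solved in this work; it assumes a simplified setting in which each satellite in a subnet hosts at most one expert, each satellite has a known expected routing-path latency, and the cost is the probability-weighted sum of those latencies. Under these assumptions, pairing experts in decreasing order of activation probability with satellites in increasing order of latency minimizes the cost (by the rearrangement inequality). The names `place_experts`, `expected_latency`, `activation_prob`, and `path_latency` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def place_experts(activation_prob: List[float],
                  path_latency: List[float]) -> Dict[int, int]:
    """Return a mapping expert index -> satellite index.

    Assumption (illustrative only): one expert per satellite, and
    path_latency[s] is the expected latency of the routing path to
    satellite s within the subnet hosting this MoE layer.
    """
    if len(activation_prob) > len(path_latency):
        raise ValueError("need at least one satellite per expert")
    # Experts from most to least frequently activated.
    experts = sorted(range(len(activation_prob)),
                     key=lambda e: -activation_prob[e])
    # Satellites from lowest to highest expected path latency.
    satellites = sorted(range(len(path_latency)),
                        key=lambda s: path_latency[s])
    # Pair the k-th most active expert with the k-th fastest satellite.
    return {e: s for e, s in zip(experts, satellites)}


def expected_latency(placement: Dict[int, int],
                     activation_prob: List[float],
                     path_latency: List[float]) -> float:
    """Probability-weighted latency of a placement (simplified cost)."""
    return sum(activation_prob[e] * path_latency[s]
               for e, s in placement.items())


# Toy example with made-up numbers: 3 experts, 4 candidate satellites.
probs = [0.1, 0.6, 0.3]
latencies = [5.0, 2.0, 9.0, 3.0]
placement = place_experts(probs, latencies)
cost = expected_latency(placement, probs, latencies)
```

In this toy example the most frequently activated expert (index 1) lands on the lowest-latency satellite (index 1), and the resulting cost is lower than that of the naive identity assignment. A realistic formulation would also need to account for multiple experts activated per token and for the actual inter-satellite routing, which this sketch ignores.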
