SpaceMoE: Realizing Distributed Mixture-of-Experts Inference over Space Networks

Leveraging continuous, high-efficiency solar energy harvesting, space data centers are envisioned as a promising platform for executing energy-intensive large language models (LLMs). Recognizing this advantage, space and AI conglomerates (e.g., SpaceX, Google) are actively investing in this vision. One key challenge, however, is deploying a large-scale LLM efficiently across a satellite network, given the limited onboard computing and communication resources. This gives rise to a placement problem: model components must be partitioned and mapped to satellites so that the fundamentally different model architecture and network topology are reconciled and token generation stays low-latency. To address this problem, we present the Space Network of Mixture-of-Experts (SpaceMoE) framework, which targets the distributed execution of a popular mixture-of-experts (MoE) model in space. The proposed placement strategies are two-level: (1) layer placement, which assigns MoE layers to satellite subnets; and (2) intra-layer expert placement, which assigns individual experts to satellites associated with the same layer/subnet. For layer placement, we exploit the ring-like communication pattern of autoregressive inference to partition the satellite constellation along the orbiting direction into subnets arranged on a ring, each hosting one MoE layer. Based on this architecture, we formulate and solve an optimization problem for intra-layer expert placement that maps experts with heterogeneous activation probabilities onto satellites. The derived strategy reveals an intuitive principle: a frequently activated expert should be mapped to a satellite on a routing path with low expected latency. Experiments over a thousand-satellite constellation show that SpaceMoE achieves at least a threefold latency reduction compared with conventional random and ablation-based placement strategies.
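The two-level placement idea can be illustrated with a minimal sketch. This is not the paper's formulation: the function names, the one-expert-per-satellite setup, and the simplified cost (a sum of activation probability times expected path latency) are all assumptions made for illustration. Under that simplified cost, pairing the most frequently activated experts with the lowest-latency satellites minimizes the sum, by the rearrangement inequality, which matches the stated principle.

```python
from typing import Dict, List


def ring_layer_of_slot(slot: int, slots_per_orbit: int, num_layers: int) -> int:
    """Layer placement (hypothetical helper): split the along-orbit slots
    into num_layers contiguous subnets arranged on a ring, and return the
    index of the MoE layer hosted by the subnet containing this slot."""
    return slot * num_layers // slots_per_orbit


def place_experts(activation_prob: List[float],
                  expected_latency: List[float]) -> Dict[int, int]:
    """Intra-layer expert placement (assumed greedy rule, one expert per
    satellite): the most frequently activated expert goes to the satellite
    whose routing path has the lowest expected latency, and so on."""
    if len(activation_prob) != len(expected_latency):
        raise ValueError("sketch assumes one satellite per expert")
    experts = sorted(range(len(activation_prob)),
                     key=lambda e: -activation_prob[e])
    sats = sorted(range(len(expected_latency)),
                  key=lambda s: expected_latency[s])
    return dict(zip(experts, sats))


def expected_cost(placement: Dict[int, int],
                  activation_prob: List[float],
                  expected_latency: List[float]) -> float:
    """Simplified cost (assumption): sum over experts of p_e * latency_s."""
    return sum(activation_prob[e] * expected_latency[s]
               for e, s in placement.items())


# Illustrative numbers only, not taken from the paper.
probs = [0.1, 0.6, 0.3]
latency = [5.0, 2.0, 9.0]
placement = place_experts(probs, latency)
naive = {e: e for e in range(len(probs))}  # expert i on satellite i
```

In this toy instance the sorted pairing places expert 1 (the most frequently activated) on satellite 1 (the lowest-latency path), and its simplified cost is lower than that of the identity placement. A real deployment would also need to account for top-k routing, where a layer's latency depends on the slowest activated expert rather than on a plain sum.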

